bullmoose20 Posted June 20, 2023 Share Posted June 20, 2023 nzwhs01-diagnostics-20230620-1350.zip So not sure what is going on... prior to the upgrade, I was getting 1.5-2 months of uptime. After the upgrade, it went down to less than 5 hours. I tried running in safe mode and that did not help. I since downgraded to 6.11.5 and had some issues with some docker images so I reinstalled the problematic ones and did not see any more errors when starting them. External syslog is not showing any issues just prior to the crashes. Mirroring syslog to flash did not show anything. I have attached my diagnostics here in hopes that someone may have an idea of what i should do at this point? Quote Link to comment
JorgeB Posted June 20, 2023 Share Posted June 20, 2023 Diags are from v6.11.5, so not much to see. 13 minutes ago, bullmoose20 said: Mirroring syslog to flash did not show anything. Post this if you have it. Quote Link to comment
bullmoose20 Posted June 20, 2023 Author Share Posted June 20, 2023 I will enable the write to flash again and remain on 6.11.5. I do not want to introduce additional variables into this... like I mentioned: 1 - Uptime of 1.5-2months 2 - Upgraded to 6.12.0 from 6.11.5 (used the update assistant) 3 - Upgrade went seemingly fine... let it soak... and was not getting more than 5 hours of uptime 4 - ran in safe mode - 5 hours max uptime 5 - through all of this, setup external syslog server and logs not showing anything prior to server reset I am pretty sure it will happen again... until then I will set the logs to write to flash and then just wait. Quote Link to comment
bullmoose20 Posted June 20, 2023 Author Share Posted June 20, 2023 Last message is highlighted before machine reset. Logs coming as soon as I get access to file system Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 syslog (2) Here is the flash syslog and the most recent diag nzwhs01-diagnostics-20230620-2006.zip Quote Link to comment
otakunorth Posted June 21, 2023 Share Posted June 21, 2023 I am having the same issue but server only looks like it's crashing the gui just dies, dockers are still running in the background. More people with this issue here Quote Link to comment
JorgeB Posted June 21, 2023 Share Posted June 21, 2023 Try booting with docker service disabled, if OK start enabling the containers one by one and re-test. Quote Link to comment
Fromnack Posted June 21, 2023 Share Posted June 21, 2023 I was experiencing a very similar issue when I came across some threads which suggested changing the docker network type from "macvlan" to "ipvlan", ever since then I've not had a single crash. Hope this helps! Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 Each one of these represents a server reset. No real discernable pattern. And it seems that the stability, whether on 6.11.5 or 6.12.0 is the same. Question for the community.... should I move back to 6.12.0 and work from there as the stability seems to be the same between 6.11.5 and 6.12.0. Sadly, it was so stable before and now the rollback has kept the instability.... 😞 @JorgeB, what am I testing? So I have about 20 running containers.... if each reboot is at about the 5 hour mark, am i really going to do this for the next 100 hours? Quote Link to comment
JorgeB Posted June 21, 2023 Share Posted June 21, 2023 I don't have a better idea, maybe someone else will, you could at least start by running it with the docker service disable to see if it's related to docker or not. Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 Do you think I should remain on 6.11.5 or upgrade to 6.12.0? I ask because I don't think its more or less stable and at this point wondering where I would get the best community support. Not sure if I should be going to ipvlan versus macvlan(I am currently on macvlan) as I do not seem to see any kernel panics at the moment.... Quote Link to comment
JorgeB Posted June 21, 2023 Share Posted June 21, 2023 If the issue is the same in both it should also be the same to troubleshoot using either one. Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 OK. I have switched to ipvlan and will wait and see... Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 If I stop docker... then the server is 100% going to be doing nothing..... I only run containers Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 I also see that 6.12.1 was released with a new kernel and bugfixes... If the ipvlan for docker does not help (trying to get more than 24 hours of up time), I will likely update to 6.12.1 so at least it won't be a question of being behind on unraid version. Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 OK. setting it to ipvlan in the docker settings did not help... system reset again... going to update to 6.12.1. Quote Link to comment
bullmoose20 Posted June 21, 2023 Author Share Posted June 21, 2023 (edited) Flipped back to macvlan and then stopped all containers and the docker service, ran the Unraid Update assistant... came back clean... then Updated to 6.12.1 without issues. restarted Docker Service and all the containers are up and running again... Waiting to see if I can pass 5 hours as such. If not, then before I goto bed, I will stop the docker service as suggested and see if I can get more than the 4-5 hours of uptime Edited June 21, 2023 by bullmoose20 Quote Link to comment
bullmoose20 Posted June 22, 2023 Author Share Posted June 22, 2023 (edited) So server rebooted two times overnight. usually around the 4 hour uptime mark. i then turned off docker service and the server rebooted at 915am and then again at 1545. no apparent reason... Edited June 22, 2023 by bullmoose20 Quote Link to comment
bullmoose20 Posted June 22, 2023 Author Share Posted June 22, 2023 (edited) I additionally turned off the VM service. Now it’s a wait and see. So both docker and VM service is turned off. Array is still running but basically doing nothing. Next will be safe mode to remove possibility of plugins causing the reboot. But I will wait to see if system reboots with both VM service and Docker service turned off before booting to safe mode. Edited June 22, 2023 by bullmoose20 Quote Link to comment
bullmoose20 Posted June 23, 2023 Author Share Posted June 23, 2023 (edited) 3 reboots later… I will now return n in safe mode to eliminate plugins. VM service off Docker Service off Safe Mode (to eliminate plugins) So effectively the only thing the server is doing is mounting the drives in the array. Edited June 23, 2023 by bullmoose20 Quote Link to comment
bullmoose20 Posted June 23, 2023 Author Share Posted June 23, 2023 (edited) Sharing some files in case someone sees something....syslog (3)nzwhs01-diagnostics-20230623-0931.zip No reboots left but since teh last reboot to safe mode was at around 8:45, the next unexpected reboot should be around 12pm-1pm (Around the 4 hour mark. Edited June 23, 2023 by bullmoose20 Quote Link to comment
Noim Posted June 23, 2023 Share Posted June 23, 2023 On 6/21/2023 at 2:32 AM, otakunorth said: I am having the same issue but server only looks like it's crashing the gui just dies, dockers are still running in the background. More people with this issue here As far as I can tell, it really crashes for me. The server in general doesn't even respond to pings after the crash. franz-diagnostics-20230623-2012.zip 1 Quote Link to comment
bullmoose20 Posted June 23, 2023 Author Share Posted June 23, 2023 In my case, the most stability that I have is the following and going on 6 hours and 6 minutes of uptime: VM service off Docker service off Rebooted to safe mode(effectively ruling out all plugins), added passphrase for encrypted disks so file systems and disks mount Since I do not use the server shares in this way and I must have my containers and vm's running... this is not really a workaround for me. Waiting for guidance on what I should do next.... Like at this point is looks like maybe a bad plugin is causing the server to reset. Nothing obvious in the syslog. So not even sure if there is an option to increase logging to possibly catch the issue? my iLo4 board just sees the server reset... no power cut, nothing... no hardware errors... nothing... Quote Link to comment
bullmoose20 Posted June 23, 2023 Author Share Posted June 23, 2023 1 hour ago, Noim said: As far as I can tell, it really crashes for me. The server in general doesn't even respond to pings after the crash. franz-diagnostics-20230623-2012.zip 121.39 kB · 1 download I see btrfs errors littered in your syslog. you could have a bad disk Quote Link to comment
bullmoose20 Posted June 23, 2023 Author Share Posted June 23, 2023 Update! currently at 9h15minutes of uptime and server has not rebooted randomly. So there might be a plugin that is causing me grief in 6.12.1. Suggestions? Should I: a - stay in safe mode with plugins not running and turn on docker service which will then start up all my containers? or b - reboot to get out of safe mode, leave Docker service off, start enabling plugins (assuming I can even do that) 1 by one and wait on each plugin to see? or c - something else Quote Link to comment
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.