
Upgraded from 6.11.5 to 6.12.0 and then had to revert back to 6.11.5 - Server crashes every 1-8 hours



nzwhs01-diagnostics-20230620-1350.zip

 

So not sure what is going on... prior to the upgrade, I was getting 1.5-2 months of uptime. After the upgrade, it went down to less than 5 hours. I tried running in safe mode and that did not help.

I have since downgraded to 6.11.5. I had issues with some Docker images afterwards, so I reinstalled the problematic ones and did not see any more errors when starting them.
External syslog is not showing any issues just prior to the crashes.
Mirroring syslog to flash did not show anything.

 

I have attached my diagnostics here in the hope that someone has an idea of what I should do at this point.

Link to comment

I will enable the write to flash again and remain on 6.11.5. I do not want to introduce additional variables into this. As I mentioned:
1 - Uptime of 1.5-2 months
2 - Upgraded to 6.12.0 from 6.11.5 (used the update assistant)
3 - Upgrade went seemingly fine... let it soak... and was not getting more than 5 hours of uptime
4 - Ran in safe mode - 5 hours max uptime
5 - Through all of this, set up an external syslog server, and the logs are not showing anything prior to the server resets

 

I am pretty sure it will happen again... until then I will set the logs to write to flash and then just wait.

Link to comment

image.thumb.png.184b0caf097b4a883004e7e7ecd86242.png

Each one of these represents a server reset. No real discernible pattern. And it seems that the stability is the same whether on 6.11.5 or 6.12.0.
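To put numbers on "no real pattern", the gaps between resets can be computed from the reset times. This is only a rough sketch: the timestamps below are made up for illustration and are not taken from my logs, and `uptime_hours` is a hypothetical helper name.

```python
from datetime import datetime
from statistics import mean, pstdev


def uptime_hours(reset_times):
    """Return the hours of uptime between consecutive resets."""
    stamps = sorted(datetime.fromisoformat(t) for t in reset_times)
    return [(b - a).total_seconds() / 3600 for a, b in zip(stamps, stamps[1:])]


# Hypothetical reset times, for illustration only (not real data).
resets = [
    "2023-06-20T01:10:00",
    "2023-06-20T05:40:00",
    "2023-06-20T07:40:00",
    "2023-06-20T15:10:00",
]

gaps = uptime_hours(resets)
print("uptime between resets (h):", gaps)
print("mean: %.2f h, spread: %.2f h" % (mean(gaps), pstdev(gaps)))
```

A spread that is large compared to the mean would back up the impression that the resets are not on a fixed timer such as a scheduled job.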

Question for the community: should I move back to 6.12.0 and work from there, since the stability seems to be the same between 6.11.5 and 6.12.0? Sadly, it was so stable before, and the instability has persisted through the rollback.... 😞


@JorgeB, what am I testing? I have about 20 running containers.... if each reboot is at about the 5 hour mark, am I really going to do this for the next 100 hours?

Link to comment

Do you think I should remain on 6.11.5 or upgrade to 6.12.0? I ask because I don't think either one is more or less stable, and at this point I am wondering where I would get the best community support. I am also not sure whether I should be moving from macvlan to ipvlan (I am currently on macvlan), as I do not see any kernel panics at the moment....

Link to comment

Flipped back to macvlan, then stopped all containers and the Docker service and ran the Unraid Update Assistant... came back clean... then updated to 6.12.1 without issues. Restarted the Docker service and all the containers are up and running again...

Waiting to see if I can get past 5 hours like this. If not, then before I go to bed I will stop the Docker service as suggested and see if I can get more than the 4-5 hours of uptime.

Edited by bullmoose20
Link to comment

I additionally turned off the VM service. Now it's a wait and see. So both the Docker and VM services are turned off. The array is still running but basically doing nothing.

 

Next will be safe mode to remove possibility of plugins causing the reboot. But I will wait to see if system reboots with both VM service and Docker service turned off before booting to safe mode.

Edited by bullmoose20
Link to comment

In my case, the most stable setup I have found so far is the following, which is now at 6 hours and 6 minutes of uptime:

  1. VM service off
  2. Docker service off
  3. Rebooted to safe mode (effectively ruling out all plugins), added passphrase for encrypted disks so file systems and disks mount

Since I do not use the server just for shares like this and I must have my containers and VMs running... this is not really a workaround for me. Waiting for guidance on what I should do next....

 

At this point it looks like maybe a bad plugin is causing the server to reset. Nothing obvious in the syslog. I am not even sure if there is an option to increase logging to possibly catch the issue? My iLO 4 board just sees the server reset... no power cut, nothing... no hardware errors... nothing...

Link to comment

Update! Currently at 9 hours 15 minutes of uptime and the server has not rebooted randomly. So there might be a plugin that is causing me grief in 6.12.1.

 

Suggestions?
Should I:
a - stay in safe mode with plugins not running and turn on docker service which will then start up all my containers?
or
b - reboot to get out of safe mode, leave Docker service off, start enabling plugins (assuming I can even do that) one by one and wait on each plugin to see?
or 
c - something else
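
For option b, one way to cut down the waiting would be to enable plugins in halves instead of one at a time. Below is a minimal sketch of that idea, not an Unraid feature: it assumes exactly one plugin is the culprit, the plugin names are made up, and `crashes_with` is a hypothetical stand-in for the manual step of enabling a subset, waiting, and noting whether the server reset.

```python
def find_bad_plugin(plugins, crashes_with):
    """Narrow a non-empty list of plugins down to one culprit by halving.

    Assumes exactly one plugin causes the resets and that
    crashes_with(subset) reliably reports whether the server reset
    while only that subset was enabled.
    """
    candidates = list(plugins)
    rounds = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        rounds += 1
        if crashes_with(half):
            # The culprit is in the half that was enabled.
            candidates = half
        else:
            # No reset, so the culprit is in the other half.
            candidates = candidates[len(half):]
    return candidates[0], rounds


# Hypothetical plugin names, for illustration only.
plugins = ["plugin-a", "plugin-b", "plugin-c", "plugin-d",
           "plugin-e", "plugin-f", "plugin-g", "plugin-h"]


# Stand-in for the manual test: pretend "plugin-f" is the bad one.
def simulated_test(enabled):
    return "plugin-f" in enabled


culprit, rounds = find_bad_plugin(plugins, simulated_test)
print(culprit, "found after", rounds, "test runs")
```

Since the resets have come anywhere from 1 to 8 hours apart, a "no reset" result would only mean something if each run goes well past the longest uptime seen so far. If more than one plugin is involved, the single-culprit assumption breaks and this approach can point at the wrong one.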

Link to comment
