Random Crashes



Hello,

I am experiencing random crashes of my Unraid server and I cannot figure out the cause. This has been going on since I moved to the new hardware.

Unraid 6.12.8

System specs are:

- Supermicro X11SCA-F

- Intel Core i3-8100

- 32 GB ECC RAM

- 2 NVMe drives (cache)

- 3 8TB drives (data)

- 10G network card (Intel X520)

 

Right now the server has been up for 1 day and 10 hours. I am not doing anything special when it crashes, and I only have a handful of Docker containers running, nothing crazy.

 

I am attaching the diagnostics and the syslog (the system has crashed at least twice since I enabled the syslog).

Thank you in advance.

nasty-diagnostics-20240303-2119.zip syslog.log

Link to comment

Unfortunately there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
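
Even when nothing relevant is logged before a crash, long silences in the syslog can at least show roughly when each crash happened. Below is a minimal Python sketch of that idea. It assumes the traditional syslog line prefix (`Mar  3 21:19:01 host ...`), English month names, and a caller-supplied year, since that prefix carries none. The names `parse_stamp` and `find_gaps` and the 30-minute threshold are illustrative choices, not anything built into Unraid.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year):
    """Parse a 'Mon DD HH:MM:SS' prefix; return None if the line has none."""
    parts = line.split(None, 3)
    if len(parts) < 3:
        return None
    try:
        # The year is supplied by the caller because the prefix does not include it.
        return datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def find_gaps(lines, year, min_gap=timedelta(minutes=30)):
    """Return (last_seen, next_seen) pairs where logging paused for min_gap or longer."""
    gaps = []
    previous = None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue  # skip lines without a recognisable timestamp
        if previous is not None and stamp - previous >= min_gap:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps
```

To use it, open the saved syslog, pass the file object and the year to `find_gaps`, and print the pairs. A quiet but healthy server can also go silent for a while, so treat each gap as a candidate to cross-check, not as proof of a crash.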

Link to comment

4 days and 4 hours up in safe mode. A record for sure!

I have Docker disabled, no VMs, and the following plugins:

  • community.applications.plg

  • docker.patch2.plg

  • dynamix.file.manager.plg

  • fix.common.problems.plg

  • tasmotapm.plg

As for Docker containers, I have:

  • The binhex-RR stack (Radarr, Sonarr, etc.)
  • DrawIO
  • WikiJS
  • ddclient

I am thinking about backing up all the data and starting fresh, since this is new hardware that I migrated to. It seems to be a software issue.

Any other ideas will be greatly appreciated! 

Link to comment

Do you believe it is a container issue?

Because if that is the case, I will just remove them all and start over.

I have no problem with that! I have good documentation on everything, so I can reproduce the setup.

Link to comment

OK, taking that into consideration, how can I delete them all, and all traces of them, without even enabling Docker?

I don't really care which container is the problem here, I will be installing them again later on, but for now I just want to have a stable system.

Link to comment
2 minutes ago, Chelun said:

OK, taking that into consideration, how can I delete them all, and all traces of them, without even enabling Docker?

I don't really care which container is the problem here, I will be installing them again later on, but for now I just want to have a stable system.

If you disable the Docker service under Settings->Docker, then no containers will be started.

Link to comment
1 minute ago, itimpi said:

If you disable the Docker service under Settings->Docker, then no containers will be started.

Docker is disabled.

My question was, how to delete all the containers without enabling Docker?

Because it is disabled, I cannot see the containers to delete them one by one.

Link to comment
3 hours ago, Chelun said:

Docker is disabled.

My question was, how to delete all the containers without enabling Docker?

Because it is disabled, I cannot see the containers to delete them one by one.

You can delete the docker.img file that holds all the container binaries (it is normally in the ‘system’ share). That will remove all containers.

You can delete all the working set files for containers by deleting the contents of the ‘appdata’ share (assuming you used that as the location for them, which is the default).
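
As a rough illustration of those two steps, here is a minimal Python sketch that lists what would be removed and only deletes when told to. The paths `/mnt/user/system/docker/docker.img` and `/mnt/user/appdata` are assumed defaults, so confirm the real locations under Settings->Docker first. The helper names `cleanup_targets` and `wipe`, and the folder names passed to `keep`, are hypothetical.

```python
from pathlib import Path
import shutil

# Assumed default Unraid locations; confirm the real paths under Settings->Docker.
DOCKER_IMG = Path("/mnt/user/system/docker/docker.img")
APPDATA = Path("/mnt/user/appdata")


def cleanup_targets(docker_img, appdata, keep=()):
    """List what a container wipe would remove, skipping appdata folders named in keep."""
    targets = []
    if docker_img.is_file():
        targets.append(docker_img)
    if appdata.is_dir():
        for entry in sorted(appdata.iterdir()):
            if entry.name not in keep:
                targets.append(entry)
    return targets


def wipe(targets, dry_run=True):
    """Print the targets, and delete them only when dry_run is False."""
    for path in targets:
        if dry_run:
            print(f"would remove {path}")
        elif path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)
        else:
            path.unlink()


if __name__ == "__main__":
    # Hypothetical folder names; use whatever your appdata share actually contains.
    wipe(cleanup_targets(DOCKER_IMG, APPDATA, keep={"wikijs", "ddclient"}), dry_run=True)
```

The dry run is the default on purpose: read the printed list before switching to `dry_run=False`, because deleting appdata removes container settings and data for good. Deleting docker.img removes every container image regardless of `keep`; the `keep` set only protects the matching appdata folders.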

 

Link to comment

A little over 10 days and no crashes since I started in safe mode.

It has been running with all plugins and only one Docker container, ddclient. I kept WikiJS in place because I need to migrate data out of it, but I deleted all the other containers along with their data!

Does having the system in safe mode mean the plugins were disabled? Does it do anything else? How do I get out of safe mode?

Link to comment
14 minutes ago, Chelun said:

Does having the system in safe mode mean the plugins were disabled?

Yes.

 

14 minutes ago, Chelun said:

How do I get out of safe mode?

Reboot. You can then uninstall all plugins and re-install them one, or a few, at a time, retesting after each step.
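
To keep track of such a staged re-install, a small Python sketch like the one below can lay out a schedule. The function name `staged_plan` and the two-day wait between batches are assumptions for illustration; pick a wait longer than the typical time between crashes.

```python
def staged_plan(items, batch_size=1, soak_days=2):
    """Return (start_day, batch) pairs: add one batch, wait soak_days, add the next."""
    plan = []
    for index in range(0, len(items), batch_size):
        day = (index // batch_size) * soak_days
        plan.append((day, items[index:index + batch_size]))
    return plan


if __name__ == "__main__":
    # Plugin file names taken from the list earlier in this thread.
    plugins = [
        "community.applications.plg",
        "docker.patch2.plg",
        "dynamix.file.manager.plg",
        "fix.common.problems.plg",
        "tasmotapm.plg",
    ]
    for day, batch in staged_plan(plugins):
        print(f"day {day}: install {', '.join(batch)}")
```

If the server crashes during a given window, the batch added at the start of that window is the first suspect. A smaller `batch_size` narrows the suspect list at the cost of a longer overall test.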

Link to comment
  • Solution

OK, so far so good.
I rebooted and deleted all Docker containers with the exception of ddclient and WikiJS. I started by adding the Community Applications plugin, waited 2 days and added the Docker patch plugin, then 2 days later added Fix Common Problems. At this point I am at almost 7 days with all of that running and no crashes.

I will go ahead and close this. As a solution, I am going to blame the problem on Docker: after the move to the new hardware, something got corrupted and kept crashing the server.

Thank you all for the help.

Link to comment
