Docker failed to start error, server occasionally crashes

Hi, my server has occasionally become unresponsive over the past 6 months or so. A reboot fixed it temporarily, but the problem came back every month or so. Starting last month, it became hard to get a functional array and Docker at all (it took multiple boots with hours of waiting in between). This week, the server became unresponsive again and I had to reboot, but now I am unable to start Docker.

 

Previously, my workaround was:

 

1.) Start the server and pause the parity check triggered by the forced restart.

2.) Immediately stop all Docker containers. Usually, about 20 minutes after stopping them, the WebUI becomes unresponsive and I cannot PuTTY into unRAID. It may become responsive again if I wait 30-120 minutes, so I wait until the system stabilizes.

3.) Once it is responsive again, start all containers.

 

In previous months I was able to get to step 3, but as of this week the server does not become responsive, even after waiting a couple of hours and rebooting multiple times.

 

I also tried booting with Docker enabled and letting it start all containers. The containers start fine, but the server freezes about 10 minutes later and does not recover.

In step 2, about 15 minutes after stopping all containers, cache reads spike and CPU and memory go to 100% before the server becomes unresponsive.

It does not seem like any particular container is causing the freezes, as the server also freezes on a fresh boot with Docker disabled.

 

In the past two days I tried waiting it out overnight after enabling dockers and got the "docker service failed to start" error the next day.

Please see the attached diagnostics; the Feb 3 set is the latest.

Also attached are the Jan 9 diagnostics (I had enabled the syslog server to try to capture the crash). The server became unresponsive around 12 noon and recovered at around 14:30.

 

Note that my cache SSD is a Crucial MX500, which has known problems, but I have been running the latest firmware, as recommended here, for over a year, so I don't think it is causing the issues.

 

Thank you in advance

unraid-diagnostics-20250203-2114.zip unraid-diagnostics-20250109-2029.zip

Solved by JorgeB

  • Community Expert

The Docker daemon was killed because the server ran out of RAM. There are a lot of containers using considerable RAM, so you may want to look into limiting them, or try running just a few.
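As a rough illustration of what limiting containers could look like, the sketch below splits a RAM budget across containers and prints a Docker `--memory` flag for each one. `--memory` is a standard `docker run` option; on Unraid such a flag would typically go into a container's Extra Parameters field, which is an assumption about the setup rather than something stated in this thread. The container names, the weights and the 4 GiB reserve are made-up values for illustration only.

```python
# Hypothetical sketch: divide a RAM budget between containers and build the
# matching Docker --memory flags. Names, weights and reserve are assumptions.

def memory_flags(total_mib, reserve_mib, weights):
    """Return a --memory flag per container, proportional to its weight."""
    budget = total_mib - reserve_mib
    if budget <= 0:
        raise ValueError("reserve must be smaller than total RAM")
    weight_sum = sum(weights.values())
    return {
        name: f"--memory={budget * weight // weight_sum}m"
        for name, weight in weights.items()
    }


if __name__ == "__main__":
    # 16 GiB total, with 4 GiB kept free for the OS and file cache (assumed).
    flags = memory_flags(
        total_mib=16 * 1024,
        reserve_mib=4 * 1024,
        weights={"plex": 4, "qbittorrent": 2, "sonarr": 1, "prowlarr": 1},
    )
    for name, flag in flags.items():
        print(f"{name}: {flag}")
```

The weights only express relative priority; sensible real limits depend on what each container actually needs under load.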

  • Author

Hi Jorge, thanks for the quick response!

 

The only way I can enable Docker without it crashing is to disable autostart for all containers and then start them manually, one at a time. So far the server has not crashed.

If I boot the server with Docker disabled, then manually enable Docker and immediately stop all services, it still crashes, and I get the "Docker service failed to start" error along with the out-of-memory error.

I will provide an update if it crashes again while I limit my containers. I will also buy more DDR3 RAM; I am currently running 16 GB.

 

  • Author

The server has become unresponsive again. I will let it run overnight to see if it becomes responsive tomorrow morning.

I limited my containers so that memory usage does not go above 60%. However, I saw iowait spike in Netdata, and the CPU is at 100% on the dashboard.

I am starting to wonder whether I should upgrade to a new CPU, mobo, RAM and SAS controller all together.

  • Author

The server is now responsive, but I get "Docker Service failed to start" in the Docker tab.

It crashed around 10pm last night, and I see a lot of php-fpm errors.

I will do a fresh boot and check whether DLNA is enabled in the Plex settings. If so, I will disable it and see how it goes.

 

Feb  4 21:48:31 unRAID php-fpm[8792]: [WARNING] [pool www] child 10646 exited on signal 9 (SIGKILL) after 62.877191 seconds from start
Feb  4 21:49:01 unRAID php-fpm[8792]: [WARNING] [pool www] child 15641 exited on signal 9 (SIGKILL) after 29.963122 seconds from start
Feb  4 21:49:38 unRAID php-fpm[8792]: [WARNING] [pool www] child 7604 exited on signal 9 (SIGKILL) after 270.698684 seconds from start
Feb  4 21:49:50 unRAID php-fpm[8792]: [WARNING] [pool www] child 18168 exited on signal 9 (SIGKILL) after 12.032912 seconds from start
Feb  4 21:50:02 unRAID php-fpm[8792]: [WARNING] [pool www] child 18512 exited on signal 9 (SIGKILL) after 11.723276 seconds from start
Feb  4 21:50:34 unRAID webGUI: Successful login user root from 100.100.129.75
Feb  4 21:56:25 unRAID php-fpm[8792]: [WARNING] [pool www] child 21744 exited on signal 9 (SIGKILL) after 349.084727 seconds from start
Feb  4 21:59:51 unRAID php-fpm[8792]: [WARNING] [pool www] child 2495 exited on signal 9 (SIGKILL) after 300.569327 seconds from start
Feb  4 21:59:53 unRAID php-fpm[8792]: [WARNING] [pool www] child 9295 exited on signal 9 (SIGKILL) after 168.833708 seconds from start
Feb  4 21:59:54 unRAID php-fpm[8792]: [WARNING] [pool www] child 12955 exited on signal 9 (SIGKILL) after 77.512407 seconds from start
Feb  4 21:59:58 unRAID php-fpm[8792]: [WARNING] [pool www] child 13046 exited on signal 9 (SIGKILL) after 79.697170 seconds from start
Feb  4 22:00:09 unRAID php-fpm[8792]: [WARNING] [pool www] child 17618 exited on signal 9 (SIGKILL) after 16.257545 seconds from start
Feb  4 22:00:12 unRAID php-fpm[8792]: [WARNING] [pool www] child 17622 exited on signal 9 (SIGKILL) after 17.983878 seconds from start
Feb  4 22:00:16 unRAID php-fpm[8792]: [WARNING] [pool www] child 17628 exited on signal 9 (SIGKILL) after 20.068387 seconds from start
Feb  4 22:00:24 unRAID php-fpm[8792]: [WARNING] [pool www] child 17642 exited on signal 9 (SIGKILL) after 24.160942 seconds from start
Feb  4 22:00:31 unRAID php-fpm[8792]: [WARNING] [pool www] child 17676 exited on signal 9 (SIGKILL) after 21.346093 seconds from start
Feb  4 22:00:35 unRAID php-fpm[8792]: [WARNING] [pool www] child 17683 exited on signal 9 (SIGKILL) after 21.563003 seconds from start
Feb  4 22:00:38 unRAID php-fpm[8792]: [WARNING] [pool www] child 17687 exited on signal 9 (SIGKILL) after 20.247445 seconds from start
Feb  4 22:00:49 unRAID php-fpm[8792]: [WARNING] [pool www] child 17699 exited on signal 9 (SIGKILL) after 23.025699 seconds from start
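
For anyone reading a syslog like the one above, a minimal sketch of how the php-fpm SIGKILL warnings could be counted per minute to see when the kills start to cluster. It assumes the line layout shown in this thread; the function name `kills_per_minute` and the sample list are illustrative, not part of any Unraid tool.

```python
import re
from collections import Counter

# Matches php-fpm "exited on signal N" warnings in the syslog layout above.
KILL_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+php-fpm\[\d+\]:.*"
    r"child (?P<child>\d+) exited on signal (?P<sig>\d+)"
)

# A few lines copied from the log above, including one unrelated entry.
SAMPLE = [
    "Feb  4 21:49:38 unRAID php-fpm[8792]: [WARNING] [pool www] child 7604 exited on signal 9 (SIGKILL) after 270.698684 seconds from start",
    "Feb  4 21:49:50 unRAID php-fpm[8792]: [WARNING] [pool www] child 18168 exited on signal 9 (SIGKILL) after 12.032912 seconds from start",
    "Feb  4 21:50:02 unRAID php-fpm[8792]: [WARNING] [pool www] child 18512 exited on signal 9 (SIGKILL) after 11.723276 seconds from start",
    "Feb  4 21:50:34 unRAID webGUI: Successful login user root from 100.100.129.75",
]


def kills_per_minute(lines):
    """Count php-fpm children killed with signal 9, grouped by minute."""
    counts = Counter()
    for line in lines:
        match = KILL_RE.match(line)
        if match and match.group("sig") == "9":
            # Collapse the double space syslog uses for single-digit days.
            minute = " ".join(match.group("stamp").split())
            counts[minute] += 1
    return counts


if __name__ == "__main__":
    for minute, count in sorted(kills_per_minute(SAMPLE).items()):
        print(minute, count)
```

A sudden jump in kills per minute only shows when the web GUI workers started dying; it does not identify the process that is using the memory.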

unraid-diagnostics-20250205-0806.zip

  • Community Expert

Docker was killed again due to OOM:

 

Feb  5 05:41:35 unRAID kernel: Out of memory: Killed process 14888 (dockerd)
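
A minimal sketch of how kernel OOM-killer entries like the line above could be pulled out of a syslog, so that the killed process names are easy to list. The regular expression is based only on the message format quoted here; `oom_victims` is a hypothetical helper name.

```python
import re

# Based on the kernel message quoted above:
# "Out of memory: Killed process <pid> (<name>)"
OOM_RE = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")

SAMPLE = [
    "Feb  5 05:41:35 unRAID kernel: Out of memory: Killed process 14888 (dockerd)",
    "Feb  4 21:50:34 unRAID webGUI: Successful login user root from 100.100.129.75",
]


def oom_victims(lines):
    """Return (pid, process name) for every OOM kill found in the lines."""
    victims = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            victims.append((int(match.group(1)), match.group(2)))
    return victims


if __name__ == "__main__":
    for pid, name in oom_victims(SAMPLE):
        print(f"killed pid {pid}: {name}")
```

The process that gets killed is not necessarily the one that consumed the memory, so a list like this is a starting point rather than a diagnosis.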

 

  • Community Expert
  • Author

Thanks for reviewing my diagnostics, JorgeB.

 

The server became unresponsive starting at 10 PM on Feb 4, and it eventually ran out of memory by 5 AM. Hence, when I checked the server in the morning, it had become responsive again but Docker had failed to start. This reddit thread describes my scenario and the same php-fpm error that I got: the server suddenly maxes out the CPU and becomes unresponsive until a forced restart. The solution suggested in that thread is to disable the DLNA server in the Plex settings. However, my DLNA server is already disabled in Plex. I have also disabled DLNA server timeline reporting, which was previously enabled.

  • Author
11 hours ago, trurl said:

Thanks for your suggestion. I have deleted my docker.img and reinstalled Plex, Prowlarr, Sonarr and qBittorrent, as I think what caused the crash last night was Sonarr finishing a couple of downloads and Plex starting to process them, which led to 100% CPU usage. I will see if I can recreate a crash with only these containers.

 

 

  • Author

Update:

After rebuilding the Docker image with the previous containers restored, the server seems much more stable! I had 6 days of uptime.

Yesterday I mounted an external hard drive with Unassigned Devices, and it was unmounted on its own shortly after. I was unable to remount it after multiple tries, and today I found my server very slow, almost unresponsive, at 6pm.

I am getting similar out-of-memory and php-fpm errors.

There are high reads on the cache drive and some reads on the flash drive. I cannot confirm which process is consuming 100% of the CPU.

 

Feb 11 18:13:53 unRAID php-fpm[9253]: [WARNING] [pool www] child 32653 exited on signal 9 (SIGKILL) after 427.466938 seconds from start
Feb 11 18:13:55 unRAID php-fpm[9253]: [WARNING] [pool www] child 2861 exited on signal 9 (SIGKILL) after 287.056577 seconds from start
Feb 11 18:14:07 unRAID php-fpm[9253]: [WARNING] [pool www] child 10615 exited on signal 9 (SIGKILL) after 13.046237 seconds from start
Feb 11 18:14:08 unRAID php-fpm[9253]: [WARNING] [pool www] child 10694 exited on signal 9 (SIGKILL) after 12.878893 seconds from start
Feb 11 18:14:21 unRAID php-fpm[9253]: [WARNING] [pool www] child 11005 exited on signal 9 (SIGKILL) after 13.019033 seconds from start
Feb 11 18:14:22 unRAID php-fpm[9253]: [WARNING] [pool www] child 11073 exited on signal 9 (SIGKILL) after 13.837164 seconds from start

 

Please see the attached diagnostics and screenshot. I will provide an update if it becomes unresponsive again.

100 CPU high cache read.png

unraid-diagnostics-20250211-1828.zip

  • Author
17 hours ago, JorgeB said:

Update to v7.0 and try this, but this happening suggests you are getting close to exhausting the server's RAM:

 

https://docs.unraid.net/unraid-os/release-notes/7.0.0/#excessive-flash-drive-activity-slows-the-system-down

 

 

Hi JorgeB,

 

I was hesitant to update to v7.0 as I thought it could bring up additional issues. I will update it to 7.0 and apply the suggested method this weekend. Thanks.

 

  • Author

Feb 16, it crashed. I rebooted the server and backed up appdata.

 

Feb 17, I upgraded to unRAID 7.0.0 without any issues! I updated all plugins and containers, followed the suggested method linked above, and rebooted. The method uses around 500 MB of RAM to ensure the OS files always stay in memory.

 

Feb 18, I was unable to connect to the server while at work, and I confirmed that the WebUI was unresponsive. I was able to PuTTY into the server, although it was very slow: logging in took about 5 minutes. I generated diagnostics before rebooting; please see the attached file.

Reading the syslog, I am getting a bunch of nginx alerts, followed by php-fpm warnings. Any help is much appreciated.


Feb 18 00:30:14 unRAID nginx: 2025/02/18 00:30:14 [alert] 11587#11587: worker process 17638 exited on signal 6
Feb 18 00:30:14 unRAID nginx: 2025/02/18 00:30:14 [alert] 11587#11587: worker process 2093576 exited on signal 6
Feb 18 00:30:18 unRAID nginx: 2025/02/18 00:30:18 [alert] 11587#11587: worker process 2093577 exited on signal 6
Feb 18 00:30:21 unRAID nginx: 2025/02/18 00:30:21 [alert] 11587#11587: worker process 2093754 exited on signal 6
Feb 18 00:30:21 unRAID nginx: 2025/02/18 00:30:21 [alert] 11587#11587: worker process 2093797 exited on signal 6
Feb 18 00:30:21 unRAID nginx: 2025/02/18 00:30:21 [alert] 11587#11587: worker process 2093798 exited on signal 6
Feb 18 00:30:22 unRAID nginx: 2025/02/18 00:30:22 [alert] 11587#11587: worker process 2093799 exited on signal 6
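
The nginx alerts above report signal 6, while the php-fpm warnings elsewhere in this thread report signal 9. As a small sketch, the snippet below turns those numbers into names for lines in this syslog layout. The number-to-name table reflects common Linux numbering (6 is SIGABRT, conventionally raised when a process aborts itself; 9 is SIGKILL, which is sent from outside the process) and is an assumption for other platforms; `exit_signal` is a hypothetical helper name.

```python
import re

# Signal numbers as commonly used on Linux; other platforms may differ.
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}

SAMPLE = [
    "Feb 18 00:30:14 unRAID nginx: 2025/02/18 00:30:14 [alert] 11587#11587: worker process 17638 exited on signal 6",
    "Feb 18 18:09:31 unRAID php-fpm[11364]: [WARNING] [pool www] child 2748316 exited on signal 9 (SIGKILL) after 48.454789 seconds from start",
]


def exit_signal(line):
    """Return (program, signal name) for an 'exited on signal N' line, else None."""
    match = re.search(r"exited on signal (\d+)", line)
    # Syslog layout: month, day, time, host, then "tag[pid]: message".
    parts = line.split(None, 4)
    if match is None or len(parts) < 5:
        return None
    program = parts[4].split(":", 1)[0].split("[", 1)[0]
    number = int(match.group(1))
    return program, SIGNAL_NAMES.get(number, f"signal {number}")


if __name__ == "__main__":
    for line in SAMPLE:
        print(exit_signal(line))
```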

unraid-diagnostics-20250218-1807.zip

  • Author

After rebooting, my CPU is hovering at 45% but I am unable to see any container using the resource through Netdata.

This is what I see in the terminal with "htop" command. Should I be concerned about the repeated "docker run --log-opt max-size=50m --log-opt max-file=1 --log-level=fatal --storage-driver=btrfs my-container"? Thanks.

 

Htop.PNG

  • Community Expert
  • Solution
Feb 18 18:09:29 unRAID php-fpm[11364]: [WARNING] [pool www] child 2743323 exited on signal 9 (SIGKILL) after 2007.132350 seconds from start
Feb 18 18:09:31 unRAID php-fpm[11364]: [WARNING] [pool www] child 2748316 exited on signal 9 (SIGKILL) after 48.454789 seconds from start
Feb 18 18:09:33 unRAID php-fpm[11364]: [WARNING] [pool www] child 2748318 exited on signal 9 (SIGKILL) after 47.555899 seconds from start
Feb 18 18:09:35 unRAID php-fpm[11364]: [WARNING] [pool www] child 2748319 exited on signal 9 (SIGKILL) after 47.087319 seconds from start
Feb 18 18:09:38 unRAID php-fpm[11364]: [WARNING] [pool www] child 2748365 exited on signal 9 (SIGKILL) after 41.878302 seconds from start
Feb 18 18:09:39 unRAID php-fpm[11364]: [WARNING] [pool www] child 2748375 exited on signal 9 (SIGKILL) after 30.350479 seconds from start

 

In my experience, these errors usually mean that the server is running very low on memory. The GUI will typically still respond, but it can be very, very slow.

 

I would recommend adding a little more RAM, or limiting the RAM usage for the current VMs/containers you are using.
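
To get a rough idea of how close a Linux server is to running out of memory, one option is to compare `MemTotal` and `MemAvailable` from `/proc/meminfo`. The sketch below parses text in that format and reports the percentage in use. The sample numbers are invented for illustration and are not taken from the diagnostics in this thread; on a live system the same function could be fed the contents of `/proc/meminfo`.

```python
# Invented sample in /proc/meminfo format (values in kB), for illustration.
SAMPLE_MEMINFO = """\
MemTotal:       16384000 kB
MemFree:          512000 kB
MemAvailable:    4096000 kB
Buffers:          128000 kB
"""


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_percent(meminfo):
    """Percentage of RAM that is not available for new allocations."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE_MEMINFO)
    print(f"memory in use: {used_percent(info):.1f}%")
```

`MemAvailable` is used rather than `MemFree` because it accounts for cache that the kernel can reclaim, so it is usually the better estimate of remaining headroom.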

  • Author

Hi JorgeB, thanks for the suggestion. I will install additional RAM this weekend to see if that helps. 

  • 1 month later...
  • Author
On 2/19/2025 at 6:32 PM, belupig said:

Hi JorgeB, thanks for the suggestion. I will install additional RAM this weekend to see if that helps. 

Update on this: adding extra RAM was the solution, and my server has been up for 42 days without issues. Thanks JorgeB!
