[6.12.2-3][Unsolved][Upgrade issue] Docker service fails to start in 6.12.x



Hello,

 

My docker service is not starting and reports very little when it fails. It begins to start, then stops without a useful error message. Based on the logs I could find, the issue may be with `iptables -t nat`, but I am unsure how to overcome it.

 

Server is a Dell R510 with 2*X5670 CPU, 128G RAM, and a DAS. It is not overclocked. Not much is publicly accessible.

 

I am running Unraid 6.12.3. The problem started when I upgraded from 6.11.5 to 6.12.2. On the previous version the docker service ran many containers, but it has failed to start on every version since the upgrade.

Symptoms:

1) On the Docker web GUI tab the message "Docker Service failed to start." is displayed in a yellow box without further errors.

2) Syslog says:

Quote

Jul 17 16:57:51 Asgard kernel: BTRFS info (device loop2): enabling ssd optimizations

Jul 17 16:57:51 Asgard root: Resize device id 1 (/dev/loop2) from 25.00GiB to max

Jul 17 16:57:51 Asgard emhttpd: shcmd (175): /etc/rc.d/rc.docker start

Jul 17 16:57:51 Asgard root: starting dockerd ...

Jul 17 16:57:51 Asgard avahi-daemon[7186]: Server startup complete. Host name is Asgard.local. Local service cookie is 113882372.

Jul 17 16:57:52 Asgard avahi-daemon[7186]: Service "Asgard" (/services/ssh.service) successfully established.

Jul 17 16:57:52 Asgard avahi-daemon[7186]: Service "Asgard" (/services/smb.service) successfully established.

Jul 17 16:57:52 Asgard avahi-daemon[7186]: Service "Asgard" (/services/sftp-ssh.service) successfully established.

Jul 17 16:58:24 Asgard emhttpd: shcmd (178): umount /var/lib/docker

3) Attempting a CLI start returns

Quote

:~# /etc/rc.d/rc.docker start
no image mounted at /var/lib/docker

4) /var/log/docker.log shows several identical messages

Quote

:~# cat /var/log/docker.log 
time="2023-07-17T16:57:54-07:00" level=warning msg="containerd config version `1` has been deprecated and will be removed in containerd v2.0, please switch to version `2`, see https://github.com/containerd/containerd/blob/main/docs/PLUGINS.md#version-header"
failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables --wait -t nat -N DOCKER: iptables v1.8.9 (legacy): can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
 (exit status 3)

5) Running `iptables --wait -t nat -N DOCKER` does indeed fail with the error message above

Quote

iptables v1.8.9 (legacy): can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
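
The error text hints at a missing kernel module ("do you need to insmod?"). Below is a minimal sketch for checking that hint. It assumes the log format shown above and that legacy iptables tables are backed by modules named `iptable_<table>` (for example `iptable_nat`). `failed_table` and `module_listed` are hypothetical helper names, not part of Unraid or Docker. A module compiled into the kernel will not appear in /proc/modules, so a miss here is a lead to follow up, not proof.

```shell
# failed_table: read docker.log text on stdin and print each iptables table
# named in a "can't initialize iptables table" error (unique names only).
failed_table() {
    grep -F "can't initialize iptables table" |
        sed -n "s/.*can't initialize iptables table \`\([a-z]*\)'.*/\1/p" |
        sort -u
}

# module_listed NAME: read /proc/modules-style text on stdin and succeed
# only if NAME appears in the first column.
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage on the server (paths taken from the post above):
#   table=$(failed_table < /var/log/docker.log)
#   module_listed "iptable_$table" < /proc/modules \
#       || echo "iptable_$table is not listed in /proc/modules"
```

If the module turns out not to be listed, `modprobe iptable_nat` is the usual way to try loading it by hand. Whether it exists in this particular kernel build is exactly what is in question.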

 

 

Attempted fixes:

1) Removed docker.img and let the system recreate it.

2) Tried several GUI docker setting options such as macvlan and ipvlan, not preserving user defined networks, a larger vdisk size, btrfs vs directory, and a 5-minute timeout in case it just needed to get past a hang.

3) Cleaned out my go file of things not needed due to various upgrades and plugins.

4) Rebooted.

5) Installed the newer update and rebooted.

6) Manually ran `iptables --wait -N DOCKER` and attempted to start the docker service from the CLI with `/etc/rc.d/rc.docker start`
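
One detail about fix 6: that command omits `-t nat`, so it acts on the default `filter` table of iptables, not the `nat` table that dockerd fails on. The small sketch below makes the difference visible. `target_table` is a hypothetical helper, not an iptables feature, and it only understands the `-t NAME`, `--table NAME`, and `--table=NAME` spellings.

```shell
# target_table: print the table an iptables command line would act on.
# iptables falls back to "filter" when no -t/--table option is given.
target_table() {
    table=filter
    while [ $# -gt 0 ]; do
        case $1 in
            -t|--table)
                # Take the following word as the table name, if there is one.
                if [ $# -ge 2 ]; then
                    table=$2
                    shift
                fi
                ;;
            --table=*)
                table=${1#--table=}
                ;;
        esac
        shift
    done
    printf '%s\n' "$table"
}

target_table iptables --wait -N DOCKER          # prints: filter
target_table iptables --wait -t nat -N DOCKER   # prints: nat
```

So a successful run of the command in fix 6 would not by itself show that the `nat` table is usable.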

 

However, no matter what I attempt, the symptoms do not change.

 

Please advise,

 

asgard-diagnostics-20230717-1444.zip

Edited by cYnIx
Link to comment
  • cYnIx changed the title to v6.12.3 Docker service fails to start since upgrade

The problem appears to be network related:


 

failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain

 

Don't remember seeing this particular error before, but try resetting network settings to default, and delete any custom docker networks.

Link to comment

Hello,

 

I am managing the server remotely, so messing with the network settings is worrisome, but here we go.

 

In attempted fix #2 I described trying not to preserve user defined networks with the GUI option in Settings > Docker. This time I will try to delete any custom docker networks with the CLI:

Quote

:~# docker network rm mybridge
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

But as you can see I was unable to proceed as the docker service is not starting. 

 

I reset my network to default by moving /boot/custom/network.cfg to network.cfg.bak and rebooted the server.

 

Unfortunately the problem persists. 

Screenshot from 2023-07-18 12-00-21.png

Link to comment
  • cYnIx changed the title to [6.12.2-3][Unsolved][Upgrade issue] Docker service fails to start since upgrade

Hello,

 

As an update, I was able to remotely downgrade from 6.12.3 back several versions:

1) I used wget to download the version that last worked for me.

2) I unzipped the package in a directory /root/previous/ and removed the files not found in /boot/previous/

3) I then moved /boot/previous/ to /boot/v6.12.2/ 

4) Then moved /root/previous/ to /boot/previous/ 

5) With all files in place I could now navigate to Unraid's Tools > Update OS and restore the previous version.
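
Steps 2 through 4 can be sketched as a small shell function. This is only an illustration under assumptions. `stage_previous` and its three arguments are hypothetical names, and the package is assumed to be already unzipped into the staging directory. The real paths from the post (/root/previous, /boot, v6.12.2) are passed in by the caller, not hard-coded.

```shell
# stage_previous STAGED BOOT LABEL
#   STAGED: directory holding the unzipped older release (step 2)
#   BOOT:   flash root that contains an existing previous/ directory
#   LABEL:  name under which the existing previous/ is kept (step 3)
stage_previous() {
    staged=$1 boot=$2 label=$3
    # Refuse to run unless both directories exist and LABEL is still free.
    [ -d "$staged" ] && [ -d "$boot/previous" ] || return 1
    [ ! -e "$boot/$label" ] || return 1

    # Step 2: drop staged entries with no counterpart in previous/.
    # (The glob skips dotfiles; handle those by hand if the package has any.)
    for f in "$staged"/*; do
        [ -e "$f" ] || continue
        name=$(basename "$f")
        [ -e "$boot/previous/$name" ] || rm -rf -- "$f"
    done

    # Step 3: keep the existing previous/ under the given label.
    mv -- "$boot/previous" "$boot/$label" || return 1
    # Step 4: the staged release becomes the new previous/.
    mv -- "$staged" "$boot/previous"
}

# Usage matching the post:
#   stage_previous /root/previous /boot v6.12.2
```

Trying it against a scratch copy of the directories first is cheap insurance, since a mistake on the flash drive of a remote server is hard to undo.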

When the server rebooted to the last working version, lo and behold, the docker service started. I stopped it and restored my backup docker.img. Restarting docker brought back all my containers and everything was working.

 

This result means I am now convinced that there are some serious bugs with 6.12.0 through the current 6.12.3.

 

My current stance is to avoid this update series until something changes. 

Edited by cYnIx
Link to comment
  • cYnIx changed the title to [6.12.2-3][Unsolved][Upgrade issue] Docker service fails to start in 6.12.x
  • 7 months later...

It's been a few months and I saw that 6.12 has progressed to minor version .8 and I had some time so thought I would give the update another go. 

 

To hedge my bets, I moved any containers using a user-created docker network off of their custom network and onto "none", then removed the user network via the CLI with 'docker network rm my-bridge'. Running 'docker network ls' confirmed that it worked and that there are no remaining custom networks.
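
The "no remaining custom networks" check can be scripted. Below is a minimal sketch, assuming Docker's predefined network names are `bridge`, `host`, and `none`. `custom_networks` is a hypothetical helper name.

```shell
# custom_networks: read network names on stdin (one per line) and print
# those that are not Docker's predefined bridge, host, or none networks.
custom_networks() {
    grep -v -x -e bridge -e host -e none
}

# Usage with the docker service running:
#   docker network ls --format '{{.Name}}' | custom_networks
```

The filter treats every other name as custom, including any network the system itself may have created, so an empty result is the only unambiguous answer.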

I also unchecked preserve user networks in the config.

 

Of course this broke my public dockers. All the other docker containers were set to bridge.

 

Testing the 6.12.8 update AND . . . Same problem. . .

 

I fiddled around with the ipvlan and macvlan settings and the IPv4 custom network on interface br0 settings. Nothing would get the docker service to start.

 

Everything works pretty smoothly on version 6.11.5 with iptables v1.8.8 and with all dockers and plugins up to date (except CA). After upgrading to 6.12.8 with iptables v1.8.9, the docker service fails to start with the same log messages as above. Reverting to 6.11.5 and remaking the network gets it working again. The saga continues.

 

Gotta be honest, I'm feeling a bit let down here with community applications blocking app changes and the updates not working.

Link to comment
