Docker Service failed to start.

DannyG · October 8, 2020

So as the title states, my Docker won't start.

The Docker logs show this:

panic: runtime error: index out of range

goroutine 1 [running]:
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*controller).reservePools(0xc000afa300)
/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/controller.go:955 +0xd80

I've been around the block long enough to know to take diagnostics before rebooting my server.
(The reboot didn't fix this, btw.)

So here are both my diag logs.
Before and after.

I haven't made any recent changes; everything's been stable.

tower-diagnostics-20201007-1839.zip tower-diagnostics-20201007-2014.zip

DannyG · October 8, 2020

Any help would be appreciated.

DannyG · October 8, 2020

So I'm reading my logs.

It seems to be complaining about the network.

And it dawned on me that I lied earlier about not making any recent changes.

I followed this link https://www.target-bravo.com/blog/2018/7/10/blog-headline-1-6y3tj-49m5s-7tbal-56sar

and added a certificate to my Unraid server so I can access it on my local LAN with a local DNS name without Chrome complaining about the cert.

DannyG · October 9, 2020

Bump!
I need my Docker! I have kids at home and Plex is down... I'm dying here.

DannyG · October 13, 2020

I'm not sure why no one is replying...

But I rebuilt my docker image on Friday night, and this got my Docker online again.
I rebuilt it at the same 60GB size (which is large, I think).

Now it's Monday (3 days later) and my Docker is experiencing the same issue.
It's showing "version: not available", which is what happened before, and after I rebooted, the Docker service didn't want to start.

I've uploaded my latest diagnostic files.

tower-diagnostics-20201012-2016.zip

trurl · October 13, 2020

Why do you have 60G allocated to docker.img? Have you had problems filling it? 20G is usually more than enough. I have 17 containers and they are using less than half of 20G docker.img.

Also, your system share has files on the array, possibly including that docker.img.

Your syslog is spammed with these messages, any idea what that is about?

Oct 10 04:42:12 Tower kernel: br0: topology change detected, propagating
Oct 10 04:42:16 Tower kernel: br0: port 1(eth1) received tcn bpdu
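
When the syslog is flooded like this, one way to see what else is in there is to filter the repeated bridge messages out first. The following is only a minimal sketch: the regular expression is built from the two kernel lines quoted above, the function name `drop_bridge_noise` is made up for illustration, and the third sample line is invented to show a message that survives the filter.

```python
import re

# Pattern derived from the two kernel messages quoted in this post.
NOISE = re.compile(
    r"kernel: br\d+: (?:topology change detected|port \d+\(\w+\) received tcn bpdu)"
)

def drop_bridge_noise(lines):
    """Return only the lines that are not bridge topology-change messages."""
    return [line for line in lines if not NOISE.search(line)]

sample = [
    "Oct 10 04:42:12 Tower kernel: br0: topology change detected, propagating",
    "Oct 10 04:42:16 Tower kernel: br0: port 1(eth1) received tcn bpdu",
    # Invented line, only here to show what is kept:
    "Oct 10 04:42:20 Tower example: some other message",
]

for line in drop_bridge_noise(sample):
    print(line)
```

In practice the list of lines would come from the syslog file inside the diagnostics zip instead of the hard-coded sample.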

DannyG · October 13, 2020

My original docker image filled up once, so I made it larger. I have the available space, so I didn't think tripling my docker image to 60GB would be an issue.

Could adding a cert and forcing SSL cause that networking issue? I don't think so... I might have removed an interface, thinking I was cleaning things up.

I had to google what "received tcn bpdu" means... and I still have no idea.

trurl · October 13, 2020

18 minutes ago, DannyG said:

My original docker image filled up once

Did you figure out why and fix it?

DannyG · October 13, 2020

No, that was a long time ago. I just assumed I was installing too many apps.

trurl · October 13, 2020

The usual reason for filling docker.img is an app writing to a path that isn't mapped.

Docker.img is mounted, but it's not clear whether it has corruption, since there is no useful info in the syslog due to all that other stuff.

How many containers do you have?

DannyG · October 13, 2020

around 15

trurl · October 13, 2020

13 hours ago, DannyG said:

around 15

Pretty sure 60G should not be needed for that many. As I said:

14 hours ago, trurl said:

I have 17 containers and they are using less than half of 20G docker.img.

Making it larger won't fix anything, it will just make it take longer to fill.

Do you understand what I mean by this?

13 hours ago, trurl said:

The usual reason for filling docker.img is an app writing to a path that isn't mapped.

Linux is case-sensitive, for example, so if an application is writing to a path that doesn't match a container path re: upper/lower case then it is writing into docker.img
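
To make the case-sensitivity point concrete, here is a small sketch of the check involved. It is not an Unraid or Docker tool; the function name `is_mapped` and the example paths are hypothetical. It only tests whether a path an app writes to falls inside one of the container paths mapped in the template, using a case-sensitive comparison as Linux would.

```python
from pathlib import PurePosixPath

def is_mapped(write_path, container_paths):
    """True if write_path is inside one of the mapped container paths (case-sensitive)."""
    target = PurePosixPath(write_path)
    for mapped in container_paths:
        root = PurePosixPath(mapped)
        if target == root or root in target.parents:
            return True
    return False

# Hypothetical container paths from a template:
mapped = ["/config", "/downloads"]

print(is_mapped("/downloads/file.mkv", mapped))  # True
print(is_mapped("/Downloads/file.mkv", mapped))  # False
```

The second path differs only in the capital D, so it is not covered by any mapping and the data would end up inside docker.img.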

DannyG · October 13, 2020

Yes, I get it.
How do I fix this now?

trurl · October 13, 2020

Delete docker.img, change size to 20G, recreate, reinstall your dockers using the Previous Apps feature on the Apps page.

DannyG · October 14, 2020

Alright, I tried recreating my docker image, and I'm getting the following errors:
Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix/include/DefaultPageLayout.php(521) : eval()'d code on line 62

Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix/include/DefaultPageLayout.php(521) : eval()'d code on line 572

Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix/include/DefaultPageLayout.php(521) : eval()'d code on line 608

Squid · October 14, 2020

Top of my head, I'd say those errors are from custom networks you had defined. What I would do is stop the service, delete the image again, apply. Reboot then try it again.

DannyG · October 14, 2020

There is no image currently... it's deleted.
I'm seeing this before I can even create an image.

Edited October 14, 2020 by DannyG
newb

DannyG · October 14, 2020

OK, so the warning for line 62 is always there, regardless of whether I have a docker image or not.
When I recreate the image, the warnings for lines 572 and 608 appear... but the image still gets created.
I tried deleting it and recreating it yet again, but still... same results.

DannyG · October 14, 2020

The only clue I have is network related... I thought I had fixed my bonded link, but after taking a closer look, it appears that it's not enabled.
I'm going to look at and possibly rebuild the networking on my Unraid server and verify the port configs on my Cisco switch.
Once that's done, I'll rebuild my docker image and report back.

Please let me know what you think of this path.

DannyG · October 20, 2020

I should probably start a new thread because I have a new question now, but I deleted the docker image and also reconfigured my network from scratch.

I deleted the network.cfg file on my USB, and while reconfiguring my network it appears that I might have found a good clue to the "network spamming".

See, I had 2 NICs set up as a "trunk" on my switch and Unraid server, but it appears that eth0 no longer has a MAC address.
My 2 NICs are showing up as eth1 and eth3.

I kept the network configuration simple this time and just left it as a single network connection (no bonding).

I reinstalled all my Docker apps.

The new problem is still network related: most of my apps aren't getting any IPs.
I have my home automation hub and wifi controller configured on br1, so they are getting a native IP address, but many of my other apps (like Plex) aren't getting a bridge IP.

How do I reset the network configs for these apps? I don't want to delete everything... I just want to reset the network configs (I'm assuming it's remembering some old configs).

All my VMs were like that too... they couldn't start because of a missing network X, but they were all easily fixed (I just had to select br1).

Edited October 20, 2020 by DannyG

trurl · October 20, 2020

If you had custom docker networks you have to recreate them after recreating docker.img, then you can select the custom network when you edit the container.

DannyG · October 20, 2020

I only had 1, for Ombi.
I haven't recreated it yet.

Why would none of my dockers get an IP just because 1 of my dockers had a custom network that's currently missing?

DannyG · October 21, 2020

So... every time I reboot my server now, my network configs don't stick. They revert to a non-working state.
I'm going to pull the USB stick out again and verify whether the network.cfg file was recreated (I assumed it would have been after I renamed the original file to .bkp).

However, even before rebooting, I recreated the one and only custom network that I had created for Ombi.
Not only did Ombi not get an IP after that, none of the other dockers did either.

If I watch the Unraid server boot up with a monitor plugged in, it keeps complaining that eth0 doesn't exist.

DannyG · October 22, 2020

1) So I reverted to my backup (I put my network.cfg back to how it was).

This "gave me back" br0, which was gone after deleting that file and had become br1.

It looks like my Docker apps hate that and want to be on exactly the same custom network as before.

Where can I find these configs?

2) I edited the network-rules.cfg file and changed eth1 back to eth0 and eth3 back to eth1. This was the root cause of my Unraid server shitting the bed. Why it happened, I'm still not sure.
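
For anyone making the same edit, the renaming described above can be sketched in a few lines. This is only an illustration under assumptions: it assumes network-rules.cfg holds udev-style rules in which each NIC line ends with a `NAME="ethN"` entry, the MAC addresses in the sample are placeholders, and the helper name `rename_interfaces` is made up. Back up the real file before changing anything.

```python
import re

def rename_interfaces(rules_text, renames):
    """Rewrite NAME="ethN" entries according to renames, e.g. {"eth1": "eth0"}.

    All replacements happen in a single pass, so eth3 -> eth1 is not
    afterwards turned into eth0 by the eth1 -> eth0 rule.
    """
    def swap(match):
        old = match.group(1)
        return 'NAME="{}"'.format(renames.get(old, old))
    return re.sub(r'NAME="(eth\d+)"', swap, rules_text)

# Placeholder rules with made-up MAC addresses (assumed format):
sample = (
    'SUBSYSTEM=="net", ATTR{address}=="aa:aa:aa:aa:aa:01", NAME="eth1"\n'
    'SUBSYSTEM=="net", ATTR{address}=="aa:aa:aa:aa:aa:02", NAME="eth3"\n'
)

print(rename_interfaces(sample, {"eth1": "eth0", "eth3": "eth1"}))
```

The single-pass substitution matters here because the two renames overlap: applying them one after the other would collapse both NICs onto eth0.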

Most of my apps are getting their own 172. IP address (not shown in screenshot), but some others still aren't.
How can I manually edit the network configs for these apps?
