TL;DR: I filled the cache drive to 100% while transferring files, ran mover, and finished the transfer. After this I rebooted the server because a new Docker container would not start.
Now none of the containers will start. They all fail with "OCI runtime create failed" (the Docker service itself starts with no errors).
Hi,
I was transferring some large backup files from Windows to a backup share on my server using robocopy. The backup share is set to use cache.
I underestimated the size of the files, and the cache drive ended up 100% full.
While this was transferring in the background, I added a new Docker container (adguard-home). This new container would not start.
Note: this container was added while there was still at least 100 GB of free space on the cache drive.
After a while I noticed that the cache drive was completely full and a few containers had stopped.
I then stopped the transfer, ran mover to free up space on the cache, and resumed the transfer.
When it was done transferring I ran mover again (yes, I know), then I restarted the server since some containers had stopped and wouldn't start (adguard-home, overseerr and organizr).
After the reboot, none of my containers will start. They all print the same error to the log (this one is for duckdns):
time="2021-10-27T12:45:08.635963389+02:00" level=error msg="stream copy error: reading from a closed fifo"
time="2021-10-27T12:45:08.635966512+02:00" level=error msg="stream copy error: reading from a closed fifo"
time="2021-10-27T12:45:08.800240194+02:00" level=error msg="04bc6832995f19f166b0652050b9d53ccf603d5bb486a9e111b1ee0e36dee5fa cleanup: failed to delete container from containerd: no such container"
time="2021-10-27T12:45:08.800261780+02:00" level=error msg="Handler for POST /v1.41/containers/duckdns/start returned error: OCI runtime create failed: unable to retrieve OCI runtime error (open /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/04bc6832995f19f166b0652050b9d53ccf603d5bb486a9e111b1ee0e36dee5fa/log.json: no such file or directory): fork/exec /usr/bin/runc: exec format error: unknown"
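The last log line ends in `fork/exec /usr/bin/runc: exec format error`. As far as I understand, that message generally means the kernel could not recognize the file as something it can execute, for example because the file is empty, truncated, or built for another architecture. Below is a minimal Python sketch for checking this, assuming Python 3 is available on the server. The function name `describe_binary` is my own, and it only looks at the first few bytes, so an "elf" result does not prove the binary is intact.

```python
import os

# Every ELF executable starts with these four bytes.
ELF_MAGIC = b"\x7fELF"


def describe_binary(path):
    """Return a rough status for the file at `path`.

    Possible results: "missing", "empty", "elf", "script", "unrecognized".
    This is a hypothetical helper for illustration, not part of Docker or runc.
    """
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as fh:
        header = fh.read(4)
    if header == ELF_MAGIC:
        return "elf"
    if header[:2] == b"#!":
        return "script"
    return "unrecognized"


if __name__ == "__main__":
    # Path taken from the error message in the log above.
    print("/usr/bin/runc:", describe_binary("/usr/bin/runc"))
```

If this prints "empty" or "unrecognized", the runc binary itself is probably damaged. If it prints "elf", the file could still be truncated or built for the wrong architecture.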
I searched Google and found a forum thread about a container that uses NVIDIA hardware, where they got this error after a driver update. My NVIDIA driver plugin was set to download the latest driver, and a new driver was released recently, so I tried downgrading, but no luck.
This is as far as I have gotten.
I have attached the diagnostics archive. Unfortunately I don't have any logs from before the first restart (that I know of; maybe they get saved somewhere?).
Forgive any odd English; it's not my native language.
Thank you for any input and help.
Spirevipp
spiretower-diagnostics-20211027-1254.zip