Hi,
I noticed that my containers with GPUs passed through failed to restart after a CA auto-update this morning (15/06/22). One container that hadn't stopped during the night appeared to be working, but it was not transcoding a queued video; when I restarted that container manually I got the same error as below.
The error I get when starting any GPU-passed container is a "Bad parameter" pop-up. When I edit the container config and re-apply it to see the full error, I get this:
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: open failed: /proc/sys/kernel/overflowuid: permission denied: unknown.
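In case it helps narrow things down, the file the error complains about can be checked directly from the Unraid terminal (standard Linux commands; I'm not claiming the permissions are actually wrong on my box, this is just how to look):

ls -l /proc/sys/kernel/overflowuid
cat /proc/sys/kernel/overflowuid

As far as I know that file is normally 0644 root:root and just contains 65534, so if it reads fine from the shell the denial is presumably coming from the context nvidia-container-cli runs in rather than from the file itself.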
I have the nvidia-driver plugin installed and have tried reinstalling several driver versions, from v470.94 to v515.43.04. nvidia-smi shows my GPUs and the driver version correctly.
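To take my templates out of the equation, my understanding is that a minimal GPU container started from the terminal should hit the same hook; this is roughly what the Unraid template boils down to (the image and the "all" device value below are placeholders, my real containers pass the GPU UUID shown on the nvidia-driver plugin page):

docker info | grep -i runtime
docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all ubuntu nvidia-smi

The first line just confirms the nvidia runtime is registered with Docker; the second should print the same nvidia-smi table as the host if the hook is working.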
I am not sure what the cause of the error is, whether it's Docker, the nvidia plugin, or something else.
I have noticed this issue since upgrading the Unraid OS from 6.9 to 6.10 (which included an nvidia driver update).
I have tried:
A fresh docker.img, with the previously configured container templates re-downloaded.
Restoring the appdata folder from a backup.
Checking GPU usage: only the containers are set up to use the GPU (I even tried with a single container), and I also made sure not to use the OS GUI mode. According to nvidia-smi, nothing is currently using the GPU (the check I mean is sketched after this list).
Downgrading the OS from 6.10.2 to 6.10.1.
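For the GPU usage point above, the check was just nvidia-smi; something along these lines lists any process holding the GPU (the query fields are standard nvidia-smi options):

nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv

Plain nvidia-smi already shows no processes on my system, and this query is just a more explicit way to confirm it.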
Here is the diagnostics file (I had to use the command line, as the GUI method doesn't do anything).
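For anyone else whose GUI diagnostics button does nothing: as far as I know the terminal route is just the built-in command below, which drops the zip into the logs folder on the flash drive (/boot/logs).

diagnostics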
Any help will be greatly appreciated.
dailynas-diagnostics-20220615-0956.zip