• [6.12.0/6.12.1/6.12.2] - DOCKER - Dockers can't start (after 3-4 days running)


    Pducharme
    • Solved Urgent

    Hi, I would like to report a bug with the official 6.12.0 / 6.12.1 releases.

     

    Since the 6.12.x beta and RC releases, there seems to be some kind of memory leak or similar that stops Docker from working properly until a full reboot. The issue can be reproduced easily:

     

    1. Boot Unraid 6.12

    2. Start all your dockers 

    3. Wait (anywhere from 3 to 5 days, never the same amount of time)

    4. Update a docker, and it will not be able to start, throwing an "Execution Error" / "Server Error".

     

    Once you start getting this error, you won't be able to restart ANY of your running dockers (they will throw the same error).

    You can't even disable Docker and then re-enable it; that just fails.

    A full server reboot will fix it for another 3 to 5 days, then it comes back.

     

    Another observation: I'm running the "homepage" docker, which shows the health of the "PlexMediaServer" docker. Before the bug is triggered (during the first days after a reboot), it reports PlexMediaServer as "healthy". Once the bug occurs, even without any docker being restarted, homepage reports Plex as "Unhealthy". At that point, I know I need to reboot. I confirmed this by trying to restart various dockers: I got the "Execution Error" every single time Plex was shown as "unhealthy".
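
    The health status that homepage displays can also be read straight from Docker. A minimal sketch, assuming the Docker CLI is on the PATH and the container is named "PlexMediaServer" as above; the helper names `health_from_inspect` and `container_health` are made up for illustration. It only reads the `State.Health.Status` field from `docker inspect` output and does not explain why the status changed:

```python
import json
import subprocess


def health_from_inspect(inspect_json):
    """Extract State.Health.Status from the JSON printed by `docker inspect`.

    Returns None when the container defines no health check.
    """
    data = json.loads(inspect_json)
    if not data:
        return None
    health = data[0].get("State", {}).get("Health")
    return health.get("Status") if health else None


def container_health(name):
    """Run `docker inspect <name>` and return the reported health status."""
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True,
        text=True,
        check=True,  # raise if the container does not exist or Docker is down
    )
    return health_from_inspect(result.stdout)


# Example call (container name taken from the post above):
# print(container_health("PlexMediaServer"))
```

    Run on a schedule, a check like this could flag the switch to "unhealthy" before any container is updated.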

     

    Hopefully, this bug report will get some traction.  If you haven't seen it yet, I suppose it's a question of days before it happens.

     

    I see that I'm not alone with this bug (see: https://forums.unraid.net/topic/140301-unraid-612-rc7-docker-stops-after-update/#comment-1274038 )




    User Feedback

    Recommended Comments



    Hi bonienl, 

     

    I'll add my diagnostics the next time it occurs. Until then, here are the ones from the original thread, from someone who has the same issue.

     

    As for space, I can confirm I'm not running out of space anywhere (logs, docker, array, cache, etc.).

    sekiro-diagnostics-20230611-2208.zip

    Link to comment
    17 hours ago, Kira said:

    Hi,

     

    I am also facing this issue, but more frequently: the dockers only run for about 24 hours, and then I have to force-reboot my server.

     

    I suspect it's a Docker issue because I do not have any VMs at all.

     

    lserver01-diagnostics-20230623-0639.zip

     

    You have a macvlan related call trace. Change Docker to use ipvlan custom networks and retest.

     

    Link to comment
    13 hours ago, bonienl said:

     

    You have a macvlan related call trace. Change Docker to use ipvlan custom networks and retest.

     

     

    How do I fix the macvlan call trace? I can't seem to find a solution other than switching to ipvlan.

     

    However, I have been using macvlan all along. I do not want to use ipvlan because it does not show the dockers in my UDM-Pro client list.

    Link to comment
    19 hours ago, Kira said:

     

    How do I fix the macvlan call trace? I can't seem to find a solution other than switching to ipvlan.

     

    However, I have been using macvlan all along. I do not want to use ipvlan because it does not show the dockers in my UDM-Pro client list.

     

    Update: 

    Downgraded back to 6.11.5, and no issue after 24 hours so far.

     

    After further reading of the whole changelog and other parts of the forum, my issue seems most likely to be caused by that macvlan call trace.

     

    The solution is either to switch to ipvlan or, if macvlan is needed, to use a dedicated network port for Docker.

     

    Hmm, I wonder if the macvlan call trace will be fixed in a future update.

    Link to comment

    Hi,

     

    I've had this issue since 6.12.0.

     

    In my case not all dockers are started. After 2-3 days, if I update a docker, it can't start anymore. During those 2-3 days, docker updates work fine (I can update a docker 2-3 times before the error occurs).

    I need to restart the Docker service (no need to reboot the server) to get it back.

     

    Custom network type: ipvlan

     

    Diagnostics attached to the post.

     

    error.png.a551f542f4377189bad0e15240bda16d.png

     

    tower-diagnostics-20230626-1015.zip

    Edited by Peuuuur Noel
    Link to comment
    On 6/27/2023 at 9:42 AM, carnyc said:

    Hi,

    I have the same problem.
    My Docker instance just crashes after about a day and then I have to restart the server.

    Regards

    tower-diagnostics-20230627-0909.zip

     

    Ok, a downgrade to version 6.11.5 solved the problem for me. Docker is now 100% stable and reliable again. Before the next upgrade I will wait for a stable release.

    Link to comment
    47 minutes ago, carnyc said:

    I will wait for a stable release.

    Try the just-released v6.12.2; Docker was downgraded in it to see if that helps with the issue some users are having.

    Link to comment

    The issue is still there on 6.12.2 for me.

    On 6.12.1, I could restart the Docker service to get it back. But now I need to reboot the server because of:

    '/mnt/user/system/docker/docker.img' is in-use, cannot mount

     

    I also tested with a new docker.img (in case the img file was corrupted), but the problem persists.

     

    tower-diagnostics-20230703-0001.zip

    Link to comment

    As in this post, I'm using a VPN container (not the same) where 2 other containers use it with this extra parameter "--network=container:binhex-privoxyvpn" and docker with ipvlan.

    Link to comment
    1 hour ago, Peuuuur Noel said:

    As in this post, I'm using a VPN container (not the same) where 2 other containers use it with this extra parameter "--network=container:binhex-privoxyvpn" and docker with ipvlan.

    Is it possible for you to test without that container running to see if you stop seeing the issues?

    Link to comment
    2 hours ago, JorgeB said:

    Is it possible for you to test without that container running to see if you stop seeing the issues?

    Yes, testing that, will see in a few days.

    Link to comment

    Same problem here. After updating to 6.12.2, my docker containers suddenly stopped restarting properly. I created a post about it, then reverted to 6.11.5, and the problem went away.

    Even if the problem could be macvlan, I find that a bit limiting; it's very clearly a problem that appeared with the update.

    Given the number of people reporting the problem, I'd call switching to another network type a band-aid.

    I'm going to wait for a few releases before updating :)
     

     

    Link to comment

    I updated to 6.12.2 about 2 days ago and I just got this issue again. I've attached the diagnostics file. I got the "Execution Error", and I also tried restarting a docker that was running well; now it can't start either (it gives the same error). I will have to do a full server reboot.

     

    unraid-diagnostics-20230703-1950.zip

    Link to comment

    I got the same issue with stopped dockers as the users above, and had to revert to 6.11.5 to get it working again. I did the revert before reading this thread, so I have no log to attach :(

    Link to comment

    I updated to 6.12.2 from 6.11.5 two days ago, and I just had the same issue. "Execution error" when trying to start a container. Restarting docker didn't work, CPU started to go nuts, had to reboot the server. If it's related to containers with special network settings as above I would suspect "binhex-delugevpn". I'm not running that much, no VMs.

    micro-8-diagnostics-20230705-0023.zip

    Link to comment
    On 7/3/2023 at 2:22 PM, JorgeB said:

    Is it possible for you to test without that container running to see if you stop seeing the issues?

     

    I finally found what was causing this error on my side. I don't know if it's the same cause as for the post author.

    This was not caused by a VPN container.

     

    I have an Nvidia graphics card used by the Plex container for transcoding. This container logs Nvidia driver data every 5 seconds to "/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/5d096a1b1c8faeedfdcbd60dd63f971b63a6fda9ebdec9ffecd5407fa6781c85" (same as this post), which fills up the 32MB of "/run" and ends in an out-of-space error when trying to start/restart a container.

     

    Error response from daemon: failed to start shim: symlink /var/lib/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/5d096a1b1c8faeedfdcbd60dd63f971b63a6fda9ebdec9ffecd5407fa6781c85 /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/5d096a1b1c8faeedfdcbd60dd63f971b63a6fda9ebdec9ffecd5407fa6781c85/work: no space left on device: unknown
    
    Error: failed to start containers: 5d096a1b1c8faeedfdcbd60dd63f971b63a6fda9ebdec9ffecd5407fa6781c85

     

    After deleting this log file to free up space, I was able to start containers as usual.

    I added "--no-healthcheck" (thanks to Mstein999) to the Plex container to stop this log file from filling up.

     

    Was there a change in 6.12 that could cause this amount of logs?
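
    To catch this condition before containers stop starting, the fill level of "/run" and the largest files under "/run/docker" can be checked periodically. A rough sketch, assuming Python 3 is available on the server; the function names `usage_percent` and `largest_files` are hypothetical helpers, not part of Unraid or Docker:

```python
import os
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total if total else 0.0


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # lstat measures a symlink itself rather than its target
                sizes.append((os.lstat(full).st_size, full))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(sizes, reverse=True)[:limit]


if __name__ == "__main__":
    if os.path.isdir("/run"):
        print(f"/run is {usage_percent('/run'):.1f}% full")
    for size, path in largest_files("/run/docker"):
        print(f"{size / 1024:10.1f} KiB  {path}")
```

    If "/run" is close to 100% full and one file under "/run/docker" accounts for most of it, that matches the situation described above.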

    Link to comment
    4 minutes ago, Peuuuur Noel said:

    Was there a change in 6.12 that could cause this amount of logs?

    Not sure. I think not directly, at least, but you're not the first to have this issue with /run filling up. I assume you are using 6.12.2? For this release, Docker was reverted to the same major release as the one used with v6.11.5.

    Link to comment
    10 minutes ago, JorgeB said:

    Not sure. I think not directly, at least, but you're not the first to have this issue with /run filling up. I assume you are using 6.12.2? For this release, Docker was reverted to the same major release as the one used with v6.11.5.

     

    Yes, using 6.12.2

    Link to comment




