  • Frequent crashes when updating containers


    Dr. Nepharius
    • Annoyance

    I recently replaced my old Node 304 build (i7 7700) with the UGREEN 6800 Pro and installed Unraid on it, and it was working like a charm. Since I have multiple backups (tested several times, and I am able to roll back to a normal state) and was looking forward to testing the Arc GPUs, I installed the new Unraid 7 beta. The GPU works without issues and almost everything works as intended, except in one situation: updating Docker containers. More often than not, doing that causes my server to “crash”, and for the life of me I can’t figure out why. Before updating to the Unraid 7 beta I didn’t have any issues whatsoever.

    1) I select a few apps to update and the process starts as normal, but then hangs (Image 0031).

    2) Shortly after, it fails, saying that there’s already a container using the same name, which is 100% not the case (Image 0032).

    3) The screen stays like this indefinitely (I left it for quite some time). After I refresh, every page of the GUI takes forever to load.

    4) After the main “Dashboard” loads, it doesn’t show the Docker containers or VMs that are running, even though they respond when I open their IP/port (Image 0033). The only container that doesn’t work is the one that failed to update.

    Troubleshooting steps I have tried:

    - Stopping and starting the Docker service from the GUI. It hangs while stopping and never fully stops.

    - Stopping and starting the Docker service through the CLI. Same result: the service never truly stops.

    - Killing the processes related to Docker, images and containers, then retrying the above. I didn’t get any errors when killing the processes, but the Docker service still fails to stop/start.

    - Attempting a normal reboot or shutdown through the command line and with the physical button. The reboot is triggered, but it stops while unmounting something (I can’t quite remember what) and never gets past that.

    The only thing that works is to forcefully reboot the server, either by cutting the power or by using the kernel SysRq trigger. After the reboot the server works normally and all containers run without issues, but the same problem returns the next time I try to update a container.
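When the Docker service refuses to stop and a normal reboot hangs on unmount, one thing worth checking before the forced reboot is whether any processes are stuck in uninterruptible sleep (state D in the ps output), which is usually how tasks waiting on stalled I/O show up. The following is only a minimal sketch, assuming a standard procps ps; the function names filter_blocked and list_blocked are made up for illustration and are not part of Unraid.

```shell
#!/bin/bash
# Hypothetical helpers (names are assumptions, not Unraid tooling).

# Read "PID STAT COMMAND" lines on stdin and keep only the tasks whose
# state starts with "D" (uninterruptible sleep). The header line is skipped.
filter_blocked() {
    awk 'NR > 1 && $2 ~ /^D/'
}

# Apply the filter to the processes of the running system.
list_blocked() {
    ps -eo pid,stat,comm | filter_blocked
}
```

Running list_blocked while the GUI is hanging would show whether dockerd or a ZFS-related task is among the blocked ones, which may be useful detail to add to a bug report alongside the diagnostics.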

    I have attached the anonymized diagnostics as well.

    Thanks in advance to anyone who can help.

    IMG_0031.jpeg

    IMG_0032.jpeg

    IMG_0033.jpeg

    unraid-diagnostics-20240709-1556.zip




    Recommended Comments

    There’s a ZFS call trace, and your Docker runs on the ZFS NVMe pool. Maybe @JorgeB knows if there’s something funky going on.


    Edit: found it!

    Topic here
     

     

    Edited by Mainfrezzer

    Thanks @Mainfrezzer for chiming in. The behavior looks very similar to what is described in other posts here and in the linked GitHub issue (https://github.com/openzfs/zfs/issues/16324). From what I can gather, the issue affects people who use a Docker directory for storage (instead of an image) and keep that directory on ZFS. I’ll try deleting ZFS Master and report back, but judging from prior posts this looks unlikely to resolve the issue. After that, I will switch Docker from directory to image and see if that works.

     

    UPDATE 1:

    - Deleting the ZFS Master plugin does NOT resolve the issue; it happened again.

    - I’m still able to force a reboot by running

        echo 1 > /proc/sys/kernel/sysrq

    followed by

        echo b > /proc/sysrq-trigger
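The b key reboots immediately without flushing anything to disk. The kernel SysRq interface also has s (emergency sync) and u (remount filesystems read-only), and a commonly used, slightly gentler sequence is s, u, b. Below is a minimal sketch of that sequence; the function name sysrq_reboot and the DRY_RUN variable are assumptions made for this example, and on a system where ZFS is already hung the sync or remount may never complete.

```shell
#!/bin/bash
# Hypothetical helper: print (default) or perform a SysRq emergency reboot.
# DRY_RUN is an assumed variable: 1 (default) only prints the commands,
# 0 actually writes to /proc and needs root. Use 0 only as a last resort.
sysrq_reboot() {
    local key
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "echo 1 > /proc/sys/kernel/sysrq"
    else
        echo 1 > /proc/sys/kernel/sysrq
    fi
    for key in s u b; do    # s = sync, u = remount read-only, b = reboot
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "echo $key > /proc/sysrq-trigger"
        else
            echo "$key" > /proc/sysrq-trigger
            sleep 2         # give sync/remount a moment before the next key
        fi
    done
}
```

Calling sysrq_reboot with no variables set only prints the four commands, so the sequence can be reviewed before running it for real with DRY_RUN=0.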

     

    UPDATE 2:

    - Switched Docker from directory to BTRFS image and the issue disappeared, which likely confirms the issue described by the other nice folks here.
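For anyone wanting to check which of the two setups they are on, Docker itself can report its storage driver and data root. This is a rough sketch only: the function name docker_storage_summary is made up for the example, it assumes the docker CLI and GNU stat are available, and the idea that a directory on ZFS reports the zfs driver while a BTRFS image reports btrfs is an expectation, not something verified on this server.

```shell
#!/bin/bash
# Hypothetical helper: summarize where Docker keeps its data.
docker_storage_summary() {
    local driver root fstype
    # Storage driver Docker selected (for example overlay2, zfs, btrfs).
    driver=$(docker info --format '{{.Driver}}') || return 1
    # Directory Docker uses as its data root.
    root=$(docker info --format '{{.DockerRootDir}}') || return 1
    # Filesystem type under the data root (GNU stat); "unknown" on failure.
    fstype=$(stat -f -c %T "$root" 2>/dev/null) || fstype=unknown
    echo "driver=$driver root=$root fstype=$fstype"
}
```

Running docker_storage_summary before and after the switch from directory to image would make it easy to state in a report exactly which storage backend was in use when the hangs occurred.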

     

    Edited by gtensolr




