• [6.11.5] ghettoVCB ESXi backup to array NFS - deletion of old backups creates endless number of .fuse_hidden* files


    murkus
    • Urgent

    The problem I describe here first appeared after the updates in Q4 2022. Earlier versions of Unraid did not have this problem (sorry, I do not know the exact version in which it first appeared).

     

    I can 100% reproduce the problem in safe mode (no plugins) and in normal mode.

     

    I have created diagnostics which I make available directly to the unraid developer upon request.

     

    The problem only occurs on the Unraid array (this Unraid is a bare metal installation). When ghettoVCB writes through NFS to a USB drive (using the Unassigned Devices plugin), the problem does not occur. This is why I assume the problem is somewhere in the Unraid FUSE driver for the virtual array.

     

    ghettoVCB is a backup script that runs directly on ESXi hosts and backs up snapshots of specified VMs to some NFS storage. I have been using Unraid for a long time as the target NFS storage for this. Now this doesn't work any more for me. The problem description follows:

     

    The problem occurs when ghettoVCB cleans up old backups that have exceeded the defined retention. The cleanup code uses the rm command, which hangs in an endless loop creating .fuse_hidden* files.

     

    I am aware of what .fuse_hidden* files are for, and I have checked with lsof whether any process keeps open any of the files that ghettoVCB attempts to remove. None of the files is open before ghettoVCB is executed. During execution of ghettoVCB the problem happens so fast that I was unable to check with lsof.
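    Since a manual lsof check is too slow to catch the moment of deletion, a small sampling loop run on the Unraid server while the backup is in progress may help. The following is only a sketch: count_fuse_hidden and watch_fuse_hidden are hypothetical helper names, the directory argument is a placeholder for the backup share, and it assumes find and (optionally) lsof are available.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ghettoVCB or Unraid.

# Print the number of .fuse_hidden* entries below the given directory.
count_fuse_hidden() {
    find "$1" -name '.fuse_hidden*' 2>/dev/null | wc -l | tr -d ' '
}

# Sample the directory a fixed number of times and log the count.
# If lsof is installed, also print any holders of .fuse_hidden* files.
# Usage: watch_fuse_hidden <dir> [samples] [interval-seconds]
watch_fuse_hidden() {
    dir=$1
    samples=${2:-10}
    interval=${3:-1}
    n=0
    while [ "$n" -lt "$samples" ]; do
        echo "$(date '+%H:%M:%S') fuse_hidden=$(count_fuse_hidden "$dir")"
        if command -v lsof >/dev/null 2>&1; then
            # +D scans the directory tree recursively; this can be slow on large trees.
            lsof +D "$dir" 2>/dev/null | grep 'fuse_hidden'
        fi
        n=$((n + 1))
        sleep "$interval"
    done
}
```

    Starting something like watch_fuse_hidden /path/to/backup/share 60 1 shortly before the ghettoVCB run would show whether the count keeps growing and whether lsof ever reports a holder.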

     

    I have added additional debug logging to ghettoVCB to identify the exact code which is having the problem:

    In the function checkVMBackupRotation() there is the following command, which removes old backups that have exceeded the retention:

     

                logger "debug" "Removing $BACKUP_DIR_PATH/$i"
                rm -rf "$BACKUP_DIR_PATH/$i"
                RETVAL=$?

     

    The rm command spawns an endless number of .fuse_hidden* files. Unraid seems to think that some process is keeping the file(s) open at the moment they are supposed to be deleted, and renames/copies each file to .fuse_hidden... rm seems to iterate until all files are removed, and possibly tries to remove the renamed .fuse_hidden* file, which generates yet another .fuse_hidden... file.

     

    This process does not end on its own. This way ghettoVCB cannot be used in the mode where it handles backup retention on its own.
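    To check whether the hang depends on ghettoVCB at all, the removal can be tried in isolation from any NFS client that mounts the array share. The sketch below is an assumption-laden test helper, not ghettoVCB code: repro_rm is a hypothetical name, the directory argument is a placeholder for the NFS mount point, and small dummy files may not behave like real VM backup files. It runs rm -rf in the background so that a hang can be detected and reported instead of blocking forever.

```shell
#!/bin/sh
# Hypothetical reproduction helper, not part of ghettoVCB.
# Usage: repro_rm <directory-on-nfs-mount> [file-count] [max-seconds]
repro_rm() {
    base=$1
    count=${2:-20}
    limit=${3:-30}
    target="$base/fuse_hidden_repro.$$"
    mkdir -p "$target" || return 2

    # Create a few small dummy files to delete.
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "test $i" > "$target/file_$i"
        i=$((i + 1))
    done

    # Same kind of removal as in checkVMBackupRotation(), but in the
    # background so that a hang can be detected.
    rm -rf "$target" &
    pid=$!
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            kill "$pid" 2>/dev/null
            left=$(find "$target" -name '.fuse_hidden*' 2>/dev/null | wc -l | tr -d ' ')
            echo "HUNG after ${limit}s; .fuse_hidden leftovers: $left"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"
    echo "rm exited with $? after about ${waited}s"
    # Succeed only if the directory is really gone.
    [ ! -e "$target" ]
}
```

    Running repro_rm against both the array share and the Unassigned Devices share would show whether the difference described above can be reproduced without ESXi in the picture.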

     

    Note that ghettoVCB contains work-around code to work properly with slow NFS NAS devices. This code is -NOT- involved where the problem occurs. I tried to add the work-around right before the problematic command. No amount of sleep time would work around the described problem.

     

    It would be great if this problem gets fixed, as I currently cannot use the parity protection of the Unraid array for my ESXi backups - the backups only fully work on non-array storage at the moment.

     

     





