  • [6.8.3] shfs error results in lost /mnt/user


    JorgeB
    • Minor

    There are several reports in the forums of this shfs error causing /mnt/user to go away:

     

    May 14 14:06:42 Tower shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.
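To check whether a server has hit this, something like the following could be used. It is only a sketch: the default log path `/var/log/syslog` is an assumption, and the helper names are made up for illustration.

```shell
# Sketch: detect the libfuse assertion and a missing /mnt/user mount.
# The default log path is an assumption; pass another file as the first argument.

# Prints matching log lines; returns 0 if the assertion was logged, 1 otherwise.
shfs_assert_logged() {
    grep -F "unlink_node: Assertion" "${1:-/var/log/syslog}"
}

# Returns 0 if the given path (default /mnt/user) is currently a mount point.
user_share_mounted() {
    mountpoint -q "${1:-/mnt/user}"
}
```

For example, `shfs_assert_logged >/dev/null && ! user_share_mounted && echo "shfs crashed"` would report the situation described above.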

     

    Rebooting fixes it until it happens again. I remember seeing at least 5 or 6 different users with the same issue in the last couple of months. It was reported here that it is possibly this issue:

     

    https://github.com/libfuse/libfuse/issues/128

     

    Attached diags from latest occurrence.

     

     

     

    tower-diagnostics-20200514-1444.zip




    User Feedback

    Recommended Comments



    @primeval_god I have installed netdata, but I have never used it and it is completely unknown to me. I'm looking for how to create the configuration you write about, but I can't find anything sensible.

    Can you help? Any link? The netdata documentation is an ocean.

    17 hours ago, jaclas said:

    @primeval_god I have installed netdata, but I have never used it and it is completely unknown to me. I'm looking for how to create the configuration you write about, but I can't find anything sensible.

    Can you help? Any link? The netdata documentation is an ocean.

    You will need to obtain a copy of the apps_groups.conf file from the netdata container. You can use the docker cp command for this. From the command line, in the appdata folder where you store your netdata configuration, run 

    docker cp netdata:/etc/netdata/orig/apps_groups.conf apps_groups.conf

    assuming your container is named netdata. Add the line 

    shfs: shfs

    to the end of the apps_groups.conf file. Then all you need to do is bind mount the file back into the container at the path /etc/netdata/apps_groups.conf (add the entry to the docker template, and make sure both the host and container paths point to the file itself, not to a directory).
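The copy and edit steps can be sketched roughly as below. This assumes the container is named netdata and that the current directory is the appdata folder; the function names are hypothetical.

```shell
# Assumptions: container name "netdata", current directory = netdata appdata folder.

# Append the group line only if it is not already there (safe to run twice).
add_shfs_group() {
    conf="${1:-apps_groups.conf}"
    grep -qxF "shfs: shfs" "$conf" && return 0
    # Make sure the file ends with a newline before appending.
    [ -z "$(tail -c1 "$conf")" ] || echo >> "$conf"
    printf '%s\n' "shfs: shfs" >> "$conf"
}

# Copy the stock file out of the container, then add the shfs group.
fetch_and_patch() {
    docker cp netdata:/etc/netdata/orig/apps_groups.conf apps_groups.conf &&
        add_shfs_group apps_groups.conf
}
```

Outside the Unraid template, the matching bind mount would be a `-v` option of the form `-v "$PWD/apps_groups.conf:/etc/netdata/apps_groups.conf"`, where the host side is whatever path the file actually lives at.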

    Edited by primeval_god
    On 5/8/2024 at 2:39 AM, primeval_god said:

    You will need to obtain a copy of the apps_groups.conf file from the netdata container. You can use the docker cp command for this. From the command line, in the appdata folder where you store your netdata configuration, run 

    docker cp netdata:/etc/netdata/orig/apps_groups.conf apps_groups.conf

    assuming your container is named netdata. Add the line 

    shfs: shfs

    to the end of the apps_groups.conf file. Then all you need to do is bind mount the file back into the container at the path /etc/netdata/apps_groups.conf (add the entry to the docker template, and make sure both the host and container paths point to the file itself, not to a directory).

     

    Thanks, I did it and restarted the container. Now where do I look for this value in the netdata web GUI?

     

    5 hours ago, jaclas said:

     

    Thanks, I did it and restarted the container. Now where do I look for this value in the netdata web GUI?

     

    Under the Applications section, the graphs should now have a shfs line. For this specific issue the subsection apps.files will give you the number of open files for the shfs process.
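To cross-check the graph from the Unraid command line, the open files of shfs can also be counted directly from /proc. This is a minimal sketch with made-up function names; reading another user's process normally needs root.

```shell
# Count open file descriptors of one process by listing /proc/<pid>/fd.
# Prints 0 if the process does not exist or cannot be read.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Sum over every process named shfs (pidof may print several PIDs).
shfs_fd_total() {
    total=0
    for pid in $(pidof shfs); do
        total=$((total + $(count_fds "$pid")))
    done
    echo "$total"
}
```

Running `shfs_fd_total` a few times while the load changes should roughly follow the shfs line in netdata.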

    55 minutes ago, primeval_god said:

    Under the Applications section, the graphs should now have a shfs line. For this specific issue the subsection apps.files will give you the number of open files for the shfs process.

    End of the config file:

     

    image.png.13b01ded230b61e91abe637a97f54e3a.png

     

    and I can't find the metrics in netdata gui:

     

    image.thumb.png.ccb659d4a726fa16fdcbdedcaf710268.png

     

     

    I'm probably not looking where I need to. Please show a screenshot if you have it configured.

     

    P.S. I think I have a way of triggering this error. I need to check it a few more times, and then I will publish it.

     

    3 hours ago, jaclas said:

    I'm probably not looking where I need to. Please show a screenshot if you have it configured.

    If you select the "file descriptor limit" graph labeled fds in the sidebar, then one of the dimensions on the graph should be shfs. I use an older version of the UI, so I just annotated your screenshot with where to look.

    image.thumb.png.9787e8ada43857d70a629674e6b6bf7c.png

     


    @primeval_god

     

    Unfortunately, there is no such item. Here are all of them, in two screenshots:

     

    image.thumb.png.1d970922a5543c9287837bc0dac4f447.png

     

    image.thumb.png.819bc682f5c3136db6ca759d5413dbbd.png

     

    There's another one at the end: VMs, and that's it. 
    Perhaps this configuration from the apps_groups.conf file was not loaded?

     

    In fact, I didn't strictly do what you suggested, i.e. mount the file in docker, because docker already has this whole directory mapped to a directory in Unraid: 
     

    image.thumb.png.e9adf36a5c21b73b350a5f0142784d18.png

     

    So I just copied it there from inside the container and added what you pointed out. But that's probably not the problem.

     

    6 hours ago, jaclas said:

    So I just copied it there from inside the container and added what you pointed out. But that's probably not the problem.

    That shouldn't be a problem. Perhaps it's a permissions issue? Do the permissions on the apps_groups.conf file match those of the other files in /etc/netdata? I assume you restarted the container after adding the config file, but if not, that should be done as well.
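A quick way to compare them, as a sketch only: the helper names are hypothetical, and the directory argument is whichever folder is mapped to /etc/netdata.

```shell
# Print mode, owner:group and name for every .conf file in a config directory,
# so an apps_groups.conf with different permissions stands out.
list_conf_perms() {
    stat -c '%a %U:%G %n' "${1:-.}"/*.conf
}

# Succeeds only if the file exists under the exact name apps_groups.conf.
has_apps_groups() {
    [ -f "${1:-.}/apps_groups.conf" ]
}
```

Inside the container this could be run as `list_conf_perms /etc/netdata`, and `has_apps_groups /etc/netdata || echo "file missing"` confirms the file is really where it is expected.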

    On 5/10/2024 at 3:21 PM, primeval_god said:

    That shouldn't be a problem. Perhaps it's a permissions issue? Do the permissions on the apps_groups.conf file match those of the other files in /etc/netdata? I assume you restarted the container after adding the config file, but if not, that should be done as well.

     

    Eh, silly me.... I got the file name wrong, I had apps_group instead of apps_groups 🙂

    Now it works. By the way, here is a screenshot with a graph of the total number of open files, on which you can see four big jumps. The jumps happen in the shfs area (lower, yellow) whenever I start compiling an application in node. The total number of open descriptors reaches 18k, and the ulimit is set to about 40k, so there was no error this time. However, I happen to have most of the services on the server disabled; this was just a test to see whether netdata would respond correctly. I will run tests under higher load soon.

     

    image.thumb.png.652724f5d9ce3d6cec83a0a1cbe99ad7.png
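To put numbers like these in relation to the limit, a small sketch follows. It assumes the limit that matters is the per-process "Max open files" value of shfs, and the function names are invented for illustration.

```shell
# Soft "Max open files" limit of a process, read from /proc/<pid>/limits.
fd_soft_limit() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Percentage of the limit in use: 100 * used / limit (integer division).
fd_usage_percent() {
    echo $(( 100 * $1 / $2 ))
}

fd_usage_percent 18000 40000   # prints 45
```

With the figures above, 18k of about 40k descriptors is roughly 45% of the limit. `fd_soft_limit "$(pidof shfs)"` would show the real limit, assuming a single shfs process.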

    Edited by jaclas

    I have just encountered this. It caused my Plex installation to completely wipe all of the media out of its database.
