• [6.9.0] Shares disappeared and an unclean shutdown through GUI


    eagle470
    • Urgent

    Hi, I went to turn on the TV for my kid this morning and there was no Plex. I checked my shares and they weren't there. I pulled diagnostics and rebooted using the button in the upper right-hand corner (Dark Theme), and when the system came back up it complained about an unclean shutdown. I then pulled a second set of diagnostics.

     

    I had not touched the array since yesterday morning.

     

    This is not the first time shares have dropped out from under me on a version of 6.9, but other people had reported it, so I left it alone at the time.

    This occurred while I was out of town for the evening and unable to check the status or fix the array. I travel for work, so I'm marking this as urgent because this is a huge problem for me and my wife when I start traveling again. (We need that tube to distract the 2-year-old while my wife is busy with the newborn.)

    On a related note: can you do a post on what you consider Minor, Urgent, Annoyance, or Other for severity?

    unraid1-diagnostics-20210305-0941.zip unraid1-diagnostics-20210305-0932.zip




    User Feedback

    Recommended Comments

    IMO if you want stability, roll back to 6.8.3 if you can, it works beautifully if you don't need the new multiple cache pools feature. It's what I did after a few bugs and instability under 6.9.

    You'll just need to recreate the cache since the setup files for the cache have been moved between 6.8.3 and 6.9.

    39 minutes ago, SomeRandomSod said:

    IMO if you want stability, roll back to 6.8.3 if you can, it works beautifully if you don't need the new multiple cache pools feature. It's what I did after a few bugs and instability under 6.9.

    You'll just need to recreate the cache since the setup files for the cache have been moved between 6.8.3 and 6.9.

    I have already implemented two other pools, one for my dockers, one dedicated to downloads, and a third for my video surveillance drive.


    Yes, the user share file system is crashing:

    Mar  4 11:00:12 Unraid1 shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.

    I've been looking at this for a while and don't see an obvious culprit. There is a whole lot of 'noise' in the system log because quite a number of plugins are installed. Can you get rid of some of those? A danger with plugins is that they can modify the base OS and even install down-rev versions of packages. I'm not saying this is the case, but it makes analysis a lot more difficult.
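
For anyone who wants to check their own syslog for the same assertion, here is a minimal sketch in Python. The function name and the filler log line are made up for illustration; only the shfs line is copied from the log quoted above, and other shfs failures may well be worded differently.

```python
import re

# Matches the assertion line quoted in this thread. This pattern is an
# assumption based on that single line, not a complete list of shfs errors.
SHFS_ASSERT = re.compile(r"shfs: .*Assertion `(?P<expr>[^']+)' failed")


def find_shfs_assertions(lines):
    """Return (line_number, asserted_expression) for each shfs assertion failure."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SHFS_ASSERT.search(line)
        if match:
            hits.append((number, match.group("expr")))
    return hits


sample = [
    # Hypothetical filler line, not taken from any real log.
    "Mar  4 11:00:11 Unraid1 example: unrelated message",
    # The line reported in this thread.
    "Mar  4 11:00:12 Unraid1 shfs: shfs: ../lib/fuse.c:1451: unlink_node: "
    "Assertion `node->nlookup > 1' failed.",
]

print(find_shfs_assertions(sample))  # → [(2, 'node->nlookup > 1')]
```

To run it against a real log, pass the function the lines of the syslog file from your diagnostics zip instead of `sample`.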


    I have the same problem after the upgrade to 6.9.0: no shares and homes anymore. Downgrading to 6.8.3 works after re-assigning the SSD as a cache drive.


    Same issue here. Twice now the shares have gone, with "shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed." in the log. Rolling back to 6.9.0-beta30, which has actually been stable for me.

    25 minutes ago, Ancan said:

    Right! Sorry, forgot the attachment.

     

    Thanks! As a test, if you can run without NFS, please disable NFS and see if the issue persists.

    32 minutes ago, Ancan said:

    Right! Sorry, forgot the attachment.

    Also, as a different test: if you hit the same issue with NFS disabled, go to Settings/Global Share Settings and set Tunable (hard link support) to No, stop the array, then restart the array and see if the issue persists.


    I'll be away for a week or two now, and would like to keep the server as stable as possible in the meantime. Sorry that I can't be of more help. I'm also pretty dependent on NFS, since almost all of my VMs have mounted exports from the array.

     


    I'm back and have upgraded to 6.9.1 again (the Reboot button did a hard, unclean reboot, BTW). I'll keep an eye on it, but unfortunately I can't disable NFS, since the server would be of no use to me then.

    On 3/11/2021 at 11:07 PM, limetech said:

    Also, as a different test: if you hit the same issue with NFS disabled, go to Settings/Global Share Settings and set Tunable (hard link support) to No, stop the array, then restart the array and see if the issue persists.

     

    Got some time for tinkering today, and after a few stable weeks on the beta, tried updating again. The problem actually seemed to have *worsened* now for some strange reason, since it started acting up just a couple of minutes after boot instead of days.

     

    I do, however, suspect I might have found the culprit on my particular server. After replacing the flash device to make sure it hadn't gone bad, turning off the "tunable" in the Global Share Settings, and moving on to 6.9.2, I still had the server failing soon after boot, with "flash device error", no VMs/Dockers, lost shares, etc.

     

    Checking the syslogs I'd collected, I saw that the first errors in them seem to come from the eth1 NIC, which happens to be a USB3 "Type-C" NIC that I use in an LACP bond with the integrated GbE. This worked just fine before 6.9.1, but as a test I unplugged it, and the server has now been up for an hour without problems. It looks like it takes the whole USB subsystem with it when it breaks, hence the inaccessible flash device.

     

    The NIC is a Realtek "RTL8153", and some research shows that the Realtek-provided driver only works properly up to kernel 5.6, but that it will be supported again in 5.13. (https://www.phoronix.com/scan.php?page=news_item&px=Realtek-RTL8153-RTL8156-Linux).

    Edit: After some more googling, I'm no longer sure about the driver state on the different kernels. Some say it has been supported for some time already. It still seems to be what caused my problems, so I'll keep it unplugged in the meantime.

     

    Edit #2: Might have celebrated too early. Shares disappeared again, with lots of "Transport endpoint is not connected", but this time there was no "flash device error" and all VMs/Dockers were running fine. I'll post diagnostics if it happens again.
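
"Transport endpoint is not connected" is the standard message for the ENOTCONN error code, which is what a FUSE mount typically returns once the process behind it has died. As a rough sketch, a script could probe the share mount and report that state. The function name is made up for illustration, and the `/mnt/user` default is an assumption about where the user shares are mounted; adjust it for your setup.

```python
import errno
import os


def share_mount_status(path="/mnt/user"):
    """Classify a mount point as 'ok', 'disconnected', or some other error.

    The default path is an assumption, not something confirmed in this thread.
    """
    try:
        os.listdir(path)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            # errno ENOTCONN: "Transport endpoint is not connected"
            return "disconnected"
        return f"error: {exc.strerror}"
    return "ok"


print(share_mount_status())
```

Run from cron or a similar scheduler, a check like this would at least show when the shares dropped, which could help line the failure up with the syslog.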

    Edited by Ancan


  • Status Definitions

    Open = Under consideration.

    Solved = The issue has been resolved.

    Solved version = The issue has been resolved in the indicated release version.

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

    Retest = Please retest in latest release.


    Priority Definitions

    Minor = Something not working correctly.

    Urgent = Server crash, data loss, or other showstopper.

    Annoyance = Doesn't affect functionality but should be fixed.

    Other = Announcement or other non-issue.