MvL Posted April 2, 2020 (edited)
I had a strange issue and I'd like to know what could have caused it. My "user" directory vanished from my /mnt directory, and the "user0" directory was colored red. I discovered it when my Docker containers were failing. I don't have a log file because I forgot to generate one before I rebooted.
MvL (Author) Posted April 2, 2020
It happened again, and this time I downloaded the diagnostics file! I have to correct my earlier description: the "user" directory is not gone, but it is colored red. The containers also stop working, and I have to reboot the server to get everything going again. Does this have something to do with the overlay file system? I think the problems started with a NAS that was mounted via Unassigned Devices. I powered down the NAS before unmounting the share in Unassigned Devices, so I suspect something went wrong there.
tower-diagnostics-20200402-2007.zip
JorgeB Posted April 2, 2020
This is what made /mnt/user go away:
Apr 2 20:01:48 Tower shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.
But I have no idea what caused it or what it means exactly. You could try running for a day or so without any Dockers/VMs; if there are no issues, start enabling them one by one.
MvL (Author) Posted April 2, 2020
Hi Johnnie, thanks for having a look. I have the impression that it happens when I mount my NAS with Unassigned Devices and forget to unmount it before I switch the NAS off, but I'm not at all certain that this is the problem. To start, I won't mount any remote shares; if it happens again I will move on to the containers and VMs.
MvL (Author) Posted April 2, 2020
They also mention this error in this post: https://forums.unraid.net/topic/77422-666-plex-smb-shares-unresponsive/
MvL (Author) Posted April 3, 2020 (edited)
Okay, it happened again:
Apr 3 11:34:07 Tower shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.
I was messing with two containers when it happened. To be clear, nothing was mounted via Unassigned Devices this time. I was working with linuxserver/mariadb and linuxserver/nextcloud. Yesterday, when it happened, I was also working with containers: linuxserver/mariadb and linuxserver/piwigo. So if it is a container problem, it must be linuxserver/mariadb, or there is something wrong with Docker in combination with SHFS. What uses SHFS? Unassigned Devices? If it were the linuxserver/mariadb container, there should be more reports.
Update: No, of course, SHFS is also part of Unraid itself.

df -h
df: /mnt/user: Transport endpoint is not connected
Filesystem      Size  Used Avail Use% Mounted on
rootfs           63G  640M   63G   1% /
tmpfs            32M  372K   32M   2% /run
devtmpfs         63G     0   63G   0% /dev
tmpfs            63G     0   63G   0% /dev/shm
cgroup_root     8.0M     0  8.0M   0% /sys/fs/cgroup
tmpfs           128M  392K  128M   1% /var/log
/dev/sda1        29G  438M   29G   2% /boot
/dev/loop0      9.2M  9.2M     0 100% /lib/modules
/dev/loop1      7.3M  7.3M     0 100% /lib/firmware
tmpfs           1.0M     0  1.0M   0% /mnt/disks
/dev/md1        9.1T  9.1T   66M 100% /mnt/disk1
/dev/md2        9.1T  9.1T  7.3G 100% /mnt/disk2
/dev/md3        9.1T  867G  8.3T  10% /mnt/disk3
/dev/md5        9.1T  9.0T  135G  99% /mnt/disk5
/dev/md6        9.1T  9.0T  182G  99% /mnt/disk6
/dev/md7        9.1T  7.6T  1.6T  83% /mnt/disk7
/dev/md8        9.1T  7.3T  1.9T  80% /mnt/disk8
/dev/md9        9.1T  7.4T  1.7T  82% /mnt/disk9
/dev/md10       9.1T  5.5T  3.7T  60% /mnt/disk10
/dev/md13       9.1T  3.3T  5.9T  36% /mnt/disk13
/dev/sdb1       448G  336G  111G  76% /mnt/cache
shfs             91T   68T   24T  75% /mnt/user0
*********************************
/dev/sdi1       1.9T  156G  1.7T   9% /mnt/disks/downloads
/dev/sdc1       1.9T  1.4T  520G  73% /mnt/disks/WDC_WD20EARS-00MVWB0_WD-WMAZA2219709
/dev/loop2       64G  3.2G   59G   6% /var/lib/docker
/dev/loop3      1.0G   17M  905M   2% /etc/libvirt

I think /mnt/user is missing? I will compare this once I have rebooted the server.
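For comparison after the reboot, a quick check along these lines should show whether the shfs mount is actually gone without reading the whole df output (just a generic sketch, assuming the stock Unraid mount points and syslog location, nothing specific to my setup):

# should error out or report "is not a mountpoint" once shfs has dropped
mountpoint /mnt/user

# the assertion everyone is seeing can be pulled straight from the log
grep unlink_node /var/log/syslog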
JorgeB Posted April 3, 2020
43 minutes ago, MvL said:
I think /mnt/user is missing?
You do. Try to confirm which of those containers is causing this, then you can post in that container's Docker support thread.
MvL (Author) Posted April 3, 2020 (edited)
Indeed, it is missing!

root@Tower:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           63G  627M   63G   1% /
tmpfs            32M  372K   32M   2% /run
devtmpfs         63G     0   63G   0% /dev
tmpfs            63G     0   63G   0% /dev/shm
cgroup_root     8.0M     0  8.0M   0% /sys/fs/cgroup
tmpfs           128M  308K  128M   1% /var/log
/dev/sda1        29G  438M   29G   2% /boot
/dev/loop0      9.2M  9.2M     0 100% /lib/modules
/dev/loop1      7.3M  7.3M     0 100% /lib/firmware
tmpfs           1.0M     0  1.0M   0% /mnt/disks
/dev/md1        9.1T  9.1T   66M 100% /mnt/disk1
/dev/md2        9.1T  9.1T  7.3G 100% /mnt/disk2
/dev/md3        9.1T  867G  8.3T  10% /mnt/disk3
/dev/md5        9.1T  9.0T  135G  99% /mnt/disk5
/dev/md6        9.1T  9.0T  182G  99% /mnt/disk6
/dev/md7        9.1T  7.6T  1.6T  83% /mnt/disk7
/dev/md8        9.1T  7.3T  1.9T  80% /mnt/disk8
/dev/md9        9.1T  7.4T  1.7T  82% /mnt/disk9
/dev/md10       9.1T  5.5T  3.7T  60% /mnt/disk10
/dev/md13       9.1T  3.3T  5.9T  36% /mnt/disk13
/dev/sdb1       448G  336G  111G  76% /mnt/cache
shfs             91T   68T   24T  75% /mnt/user0
shfs             92T   68T   24T  75% /mnt/user
/dev/sdi1       1.9T  156G  1.7T   9% /mnt/disks/downloads
/dev/sdc1       1.9T  1.4T  520G  73% /mnt/disks/WDC_WD20EARS-00MVWB0_WD-WMAZA2219709
/dev/loop2       64G  3.2G   59G   6% /var/lib/docker

MariaDB is the only container I was also using yesterday when I was messing around, so I'm going to try another database and see what happens. (I don't want to report anything until I'm completely sure.)
itimpi Posted April 3, 2020
What is missing? /mnt/user is showing in the screenshot!
MvL (Author) Posted April 3, 2020
7 minutes ago, itimpi said:
What is missing? /mnt/user is showing in the screenshot!
The latest picture is from after a reboot, so everything is normal again. If you look at the first picture, it is missing. I also have this error in the logs:
Apr 3 11:34:07 Tower shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.
acbaldwi Posted April 13, 2020
Bumping for an answer. I too see the error message "shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed." just before all of my shares go "offline", and I have to reboot to get them back online.
Jagadguru Posted April 17, 2020
It seems to be caused by this: https://github.com/libfuse/libfuse/issues/128. A file is removed and then moved, presumably by Mover. I don't know what the fix is, though.
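If that is the trigger, the pattern would be roughly something like this (a made-up sketch against throwaway paths, only to illustrate the delete-then-rename sequence the libfuse issue describes, not a confirmed reproducer):

# both files live on the FUSE-backed user share
touch /mnt/user/scratch/a.txt /mnt/user/scratch/b.txt
# delete one name...
rm /mnt/user/scratch/a.txt
# ...then move/rename another file within the same directory, onto the just-deleted name
mv /mnt/user/scratch/b.txt /mnt/user/scratch/a.txt

Mover does broadly similar move operations when it shifts files from the cache to the array, which is presumably why it gets suspected.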
niavasha Posted May 20, 2020
Mover is disabled for me, but I just posted my details in reply to
mbc0 Posted September 7, 2020
I am also suffering from this issue. I can confirm 100% that the mover was not running when it last happened, as I was using the server at the moment it happened and the mover had finished 6-7 hours prior.
scorcho99 Posted October 22, 2021
Ran into this problem for the first time today. I was renaming and deleting some empty files on the cache drive when I lost access to all shares. I had a script that was adding and deleting files in the same directory. I was accessing the share over Samba, but the script itself ran on Unraid, so not through Samba. I'm using 6.8.3. Weirdly (and fortunately) the VMs were still running fine, and rebooting appears to have restored everything.
VBilbo Posted October 25, 2021
I am having the same issue. It started when I installed the Tdarr Docker. Presumably it is because it transcodes files, then renames and moves them within the same directory.
scorcho99 Posted December 10, 2021 (edited)
This issue has started popping up for me; I hadn't seen it before. What I changed recently is that I added a script that renames a file (a couple of times, actually) and then runs shred on it. I'm not sure if it's the shred or the rename. The file stays in the same directory; I'm just changing its extension. I mention this because I read another post where someone tied renames within the same directory to this issue. I am using 6.8.3. I'm going to experiment with removing the mv (really a rename) command.
turnipisum Posted December 23, 2021
I've had /mnt/user disappear on me twice this month! Found it via Docker errors; a restart resolves the issue.
Squid Posted December 23, 2021
4 minutes ago, turnipisum said:
a restart resolves the issue
Not the "resolution". Next time that happens, post your diagnostics before rebooting.
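For anyone who wants to grab them while the server is still in the broken state: Tools -> Diagnostics in the webGUI, or, if the GUI is unresponsive, from an SSH/console session (a rough sketch; as far as I know the CLI command writes the zip to the logs folder on the flash drive):

diagnostics
ls -l /boot/logs/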
scorcho99 Posted January 12, 2022
Well, this happened to me again today. I thought it was running the shred command that did it, but that isn't involved in today's case. I believe the trigger is again a script that uses "mv" to rename a file and then deletes it. I'll try removing that part of the action and see if it happens again. The shred command probably performs a similar action when unlinking a file name, so they might be the same type of cause. Attached are diagnostics in the degraded state. VMs seem to stay running if they were already up, and I actually still have access to an Unassigned Devices share. I'm not 100% sure this is the only case, but the files are being changed on a share on my cache drive, which is an SSD and uses btrfs. I only have one Docker container, a Minecraft server, and I haven't actually run it in months. I do have an NFS share; not sure that is relevant, but I only enabled it in the past couple of months, which roughly matches when this started occurring. The files are being edited through Samba shares though, not NFS.
tower-diagnostics-20220112-0934.zip
JorgeB Posted January 12, 2022
Jan 12 09:12:04 Tower shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.
https://forums.unraid.net/bug-reports/stable-releases/683-shfs-error-results-in-lost-mntuser-r939/
Some workarounds are discussed there, mostly disabling NFS if it's not needed, or changing everything to SMB. It can also be caused by Tdarr, if you use that.
scorcho99 Posted January 12, 2022
I'd rather not disable NFS, since it's solving a problem I had with Samba shares. I already disabled hard links a while back. Does it make sense that NFS is involved if nothing was interacting with any NFS share at the time of failure? The share is mounted in a single VM that was not running. And actually, looking back, it's not even clear that I had enabled NFS yet when I first had this problem. I don't use Tdarr.
JorgeB Posted January 12, 2022
1 hour ago, scorcho99 said:
I believe the trigger is again a script that uses "mv" to rename a file and then deletes it.
This is probably the problem. You can likely get around it by running the script against a disk share instead of a user share, or against /mnt/user0 instead of /mnt/user if the files are on the array.
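In other words, keep the operation off the full /mnt/user FUSE view. A rough sketch of the idea, with placeholder share and file names:

# instead of renaming through the user share...
# mv /mnt/user/myshare/file.tmp /mnt/user/myshare/file.dat

# ...point the script at the array-only path
mv /mnt/user0/myshare/file.tmp /mnt/user0/myshare/file.dat

# or at the specific disk the file lives on
mv /mnt/disk1/myshare/file.tmp /mnt/disk1/myshare/file.dat

The caveat with disk paths is that you need to know which disk holds the file, and you should never mix disk paths and user-share paths for the same file in one operation.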
scorcho99 Posted January 12, 2022 (edited)
1 hour ago, JorgeB said:
This is probably the problem. You can likely get around it by running the script against a disk share instead of a user share, or against /mnt/user0 instead of /mnt/user if the files are on the array.
I reworked the script to copy the file and delete the original instead, since that was a workable solution for this case. I also think I figured out why shred suddenly became so effective at triggering it: I was using the "-u" option. If you look at the help for shred, with that option it renames the file before deleting it (rather, it says it "obfuscates the filename before unlinking"). I bet if I drop that option it will be OK. Good idea on the disk shares; I'm not sure I'll go that route, but I think it could be made to work.
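For reference, the reworked pattern boils down to copy-then-delete rather than rename, and shred without -u so the removal is a plain rm. Roughly like this (file names are just examples, not the actual script):

# copy then delete, instead of mv within the same share
cp /mnt/user/myshare/work.tmp /mnt/user/myshare/work.dat && rm /mnt/user/myshare/work.tmp

# shred only overwrites here; the removal is done separately with a plain rm
shred /mnt/user/myshare/secret.bin && rm /mnt/user/myshare/secret.bin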