May 21, 2016

Primary Problem
Docker containers running on a VM separate from unRAID, with mount points on an unRAID user share, frequently hit "stale file handle" errors inside the container. Restarting the Docker container is the only way to restore functionality.

Configuration
- All-in-one ESX box with unRAID virtualized using passthrough of an HBA
- Several VMs (CentOS 7) using autofs to mount NFS shares on unRAID
- Docker (version 1.11.1) running on a VM separate from unRAID
- unRAID version 6.1.9

History
There has been a long history of "Stale NFS handle" errors from my VMs with mount points on unRAID. This was resolved by using automount, which worked as long as the services accessing the mount points were running natively on the OS (CentOS 7). I've been experimenting with Docker and noticed that volumes I mount for the Docker containers become inaccessible after anywhere from several minutes to several hours. The errors thrown by the different Docker containers are consistent: "Stale file handle".

Attempted steps to resolve the issue
- Set a timeout of 0 (indefinite) for automount, so the volumes are never unmounted
- Switched from automount to mounts via fstab
- Switched from NFS to CIFS shares
- Set the user share "downloads", which seems to experience the most problems, to live only on the cache drive

Steps left to try
- Convert from RFS (ReiserFS) to XFS. In my research I came across a post somewhere on the internet saying that the stale file handle could be an RFS problem, though at the moment I can't find the post.

Additional Information
I know there is a drive throwing read errors in my unRAID array, and I'm in the process of removing it. At this time nothing is stored on that drive. I'm following the steps outlined at https://lime-technology.com/forum/index.php?topic=37431.msg346187#msg346187 to remove the drive and keep the parity in sync while doing so.
May 22, 2016 Author
Update
- My fuse_remember is set to 330.
- Client mounts under automount look like:
  photos -fstype=nfs,rw,async,vers=3,lookupcache=none,noac,tcp 10.101.21.24:/mnt/user/photos
- Client mounts using fstab are:
  10.101.21.24:/mnt/user/TV /mnt/TV nfs defaults,nolock 0 0
- Client mounts using CIFS are:
  //tower/downloads /mnt/downloads cifs gid=100,uid=99,credentials=/root/.smbcredentials,iocharset=utf8,sec=ntlm 0 0

I built a CentOS 7 VM running NFS v4 and mounted a share from this new NFS server on the same client that was throwing the "stale file handle" error. It hasn't thrown the error in 3 hours, which already looks better than what I was experiencing with unRAID's NFS server.
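For anyone trying to reproduce this, a small client-side check makes it easier to see when a mount goes stale than waiting for a container to fail. The following is only a minimal sketch: it assumes the stale handle shows up as an OSError with errno ESTALE when the mount point is stat'ed, and the function name is_stale and the choice of mount points are illustrative, not something taken from the setup above beyond the two fstab paths.

```python
import errno
import os


def is_stale(path, stat=os.stat):
    """Return True if stat() on path fails with ESTALE (stale file handle).

    The stat callable is injectable so the check can be exercised
    without a real NFS mount.
    """
    try:
        stat(path)
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            return True
        raise  # other errors (missing path, permissions) are not staleness
    return False


if __name__ == "__main__":
    # Mount points from the fstab lines above; adjust for your own client.
    for mount_point in ("/mnt/TV", "/mnt/downloads"):
        try:
            state = "STALE" if is_stale(mount_point) else "ok"
        except OSError as exc:
            state = f"error: {exc.strerror}"
        print(f"{mount_point}: {state}")
```

Run from cron or a loop on the Docker host, this would at least timestamp when the handle goes stale, which might help correlate it with activity on the server side.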
May 23, 2016 I've moved this to Defect Reports, because it sounds like you have found a new way to cause stale handles. We haven't seen that issue in a while; it used to be associated with NFS, but was fixed. Hopefully there's a better chance here that LimeTech will see it and, when they have time, try to replicate it.
June 26, 2016 In the GUI, go to 'Settings', then the 'NFS' icon. Now turn on 'Help' by clicking on the right of the toolbar at the top of the screen. Do you see this section in the Help for the 'Tunable (fuse_remember):' setting?

"A value of -1 would be appropriate if no other timeout seems to solve the "stale file handle" on your client. Be aware that setting a value of -1 will cause the memory footprint to grow by approximately 108 bytes per file/directory name cached. Depending how much RAM is installed in your server and how many files/directories you access via NFS this may or may not lead to out-of-memory conditions."

If you are still having the issue, does this setting help? (It would probably be a good idea to have a lot of RAM installed if you use this setting...)
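To put the 108-bytes figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the help text's approximate number applies uniformly to every cached file or directory name and ignores any other overhead; the function names and the sample counts are purely illustrative.

```python
BYTES_PER_CACHED_NAME = 108  # approximate figure quoted in the help text


def estimate_cache_bytes(cached_names):
    """Estimate the extra memory used with fuse_remember = -1."""
    if cached_names < 0:
        raise ValueError("cached_names must be non-negative")
    return cached_names * BYTES_PER_CACHED_NAME


def to_mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical counts of files/directories touched over NFS.
    for count in (100_000, 1_000_000, 10_000_000):
        size = estimate_cache_bytes(count)
        print(f"{count:>10,} names -> {to_mib(size):8.1f} MiB")
```

By this estimate, a million cached names works out to roughly 100 MiB, so the setting should only become a concern on servers with little RAM or a very large number of files accessed via NFS.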
June 26, 2016 Author Thanks for the reply. I can't confirm whether changing fuse_remember works, because I have moved from unRAID to FreeNAS. NFS stability is critical to my infrastructure.
December 19, 2016 I'm getting stale file handle errors even with fuse_remember set to 0. Any help? I'm running 6.2.4, and it's a Debian VM that gets the error when I try to cd into the share after it has sat idle for a while. Here is the fstab entry from the VM:

ELSA-UNRAID:/mnt/user/misc/shared/a5000/ /mnt/user/a5000 nfs rsize=8192,wsize=8192,timeo=14,intr 0 0
December 22, 2016 That's expected behavior for fuse_remember 0. Try the default of 330.
December 22, 2016 Sorry, I had it at -1, not 0. It all started when it was 330, but I just changed it back to 330 and will try again.
Archived
This topic is now archived and is closed to further replies.