I'm wondering what ever made you think that there is a connection between the NFS stale file handles and the transport endpoint problems.
To be fair, I had in mind some Plex forum posts that try to link the Plex Media Server crash on unRAID with NFS/fuse related issues when accessing large numbers of files.
In retrospect these *could* be 2 unrelated issues with fuse that both result in loss of NFS access:
1) Accessing user share folders with large numbers of files via NFS => fuse triggers stale file handles => loss of NFS access.
2) Accessing user share folders with large numbers of files (via remote Plex clients or the local HTTP console) triggers Plex Media Server, running in unRAID user space, to access the files locally => fuse goes la la and the NFS mounts to the user shares become unavailable as a side effect.
I can trigger the problem with the following load in 5 min 58 sec, where I believe PID 15887 was Plex Media Server:
21178 shfs_getattr: ,PID: 15887
20206 shfs_write: ,PID: 19517
20206 shfs_getxattr: ,PID: 19517
18418 shfs_getxattr: ,PID: 15887
8900 shfs_read: ,PID: 15887
7292 shfs_release: ,PID: 0
5640 shfs_getattr: ,PID: 19490
4914 shfs_getattr: ,PID: 19344
4560 shfs_open: ,PID: 15887
4555 shfs_flush: ,PID: 15887
3517 shfs_getattr: ,PID: 19350
3154 shfs_getattr: ,PID: 15942
2376 shfs_read: ,PID: 19490
2284 shfs_open: ,PID: 19490
2284 shfs_flush: ,PID: 19490
2208 shfs_readdir: ,PID: 15887
812 shfs_readdir: ,PID: 19490
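For anyone wanting to build a tally like the one above from their own debug log, here is a rough Python sketch. It assumes each log line contains the shfs function name followed somewhere by "pid: <number>"; that layout is my guess from the counts above, not a documented shfs log format, and tally_calls is just a name I made up.

```python
import re
from collections import Counter

# Assumed (hypothetical) line layout: "... shfs_<op>: ... pid: <number> ..."
# Adjust the pattern to whatever your own log actually looks like.
CALL_RE = re.compile(r"(shfs_\w+):.*?\bpid:\s*(\d+)", re.IGNORECASE)


def tally_calls(lines):
    """Count (function, pid) pairs and return them most frequent first."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts.most_common()
```

Feeding it the lines of a log file, e.g. tally_calls(open(path)), gives a list of ((function, pid), count) entries that can be printed in the same "count function PID" layout as above.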
I have emailed the log file to Tom (thanks for your help m8, it's appreciated).
I'll download the source for fuse and start to educate myself about it.
One thing I have spotted is that just before fuse appears to barf, there are a lot of the following syslog entries:
Jun 7 11:52:38 unRAID shfs/user: assign_disk_high_water: disk1 size 122092910 free 68001250
Jun 7 11:52:38 unRAID shfs/user: assign_disk_high_water: disk2 size 122092910 free 93972738
Jun 7 11:52:38 unRAID shfs/user: assign_disk: disk_path: /mnt/disk1
#####################################
UPDATE: SOLVED ?
It looks like it's *not* Plex Media Server's access to the folders with large numbers of files that triggers the Plex/fuse problem.
It's Plex Media Server's access to its own local cache that triggers the problem, specifically when accessing the PhotoTranscoder cache:
plex/tmp/Library/Application Support/Plex Media Server/Cache/PhotoTranscoder/
It seems to be a problem when Plex tries to write to the fuse user share:
"shfs_create: real_path"
This is what happens just before the NFS/fuse loss of access and the Plex hang/crash. All this activity happened within 1 second in the Plex transcoder cache folder (on the user share):
10 shfs/user:,shfs_create:,pid:
10 shfs/user:,shfs_create:,real_path:
16 shfs/user:,shfs_flush:,pid:
31 shfs/user:,shfs_getattr:,lookup:
64 shfs/user:,shfs_getattr:,pid:
9 shfs/user:,shfs_getxattr:,getxattr:
9 shfs/user:,shfs_getxattr:,pid:
10 shfs/user:,shfs_open:,pid:
16 shfs/user:,shfs_release:,pid:
6 shfs/user:,shfs_rename:,pid:
10 shfs/user:,shfs_truncate:,pid:
9 shfs/user:,shfs_write:,pid:
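To find the busiest second in a syslog and see which shfs calls it contains, something like the Python sketch below might do. It assumes the usual syslog layout seen in the entries earlier in this post (month, day, time, host, then the "shfs/user:" tag followed by the function name); calls_per_second and busiest_second are names I made up for illustration.

```python
from collections import Counter, defaultdict


def calls_per_second(lines, tag="shfs/user:"):
    """Map each syslog timestamp (1 second resolution) to a Counter of function names."""
    buckets = defaultdict(Counter)
    for line in lines:
        parts = line.split()
        # Expected fields: month day HH:MM:SS host tag function: ...
        if len(parts) < 6 or parts[4] != tag:
            continue
        stamp = " ".join(parts[:3])
        buckets[stamp][parts[5].rstrip(":")] += 1
    return buckets


def busiest_second(lines):
    """Return (timestamp, Counter) for the second with the most shfs entries, or None."""
    buckets = calls_per_second(lines)
    if not buckets:
        return None
    return max(buckets.items(), key=lambda item: sum(item[1].values()))
```

Note that syslog timestamps carry no year and only 1 second resolution, so this only tells you which second was busiest, not the order of calls inside it.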
>> When I configure Plex Media Server's library and temp folders on the native disk mount
/mnt/disk1/plex/tmp/Library instead of /mnt/user/plex/tmp/Library
/mnt/disk1/plex/tmp instead of /mnt/user/plex/tmp
I can no longer recreate the problem of the fuse user shares going la la and killing the NFS mounts! Yay :-)
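To double check that a configured folder really sits on a disk mount and not on the fuse user share, one could look it up in the mount table. Below is a minimal Python sketch that takes the text of /proc/mounts and reports the filesystem type for a path. The function names are made up, and I am assuming the user share shows up with a filesystem type starting with "fuse"; check your own /proc/mounts to confirm.

```python
import posixpath


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point that contains `path`."""
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # /proc/mounts fields: device, mount point, filesystem type, options, ...
        mount_point, fs_type = fields[1], fields[2]
        inside = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
        if inside and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_fuse(path, mounts_text):
    """True if the path lives on a mount whose type starts with 'fuse' (assumption)."""
    fs_type = fs_type_for(path, mounts_text)
    return fs_type is not None and fs_type.startswith("fuse")
```

On the server itself this would be called as is_on_fuse("/mnt/user/plex/tmp", open("/proc/mounts").read()). It does not resolve symlinks, so a symlinked folder would need to be resolved first.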
All I have to do now is run fuse in debug mode and see why Plex is able to kill the user share with seemingly so little activity!