• /mnt/user disappeared


    Can0n
    • Status: Retest, Priority: Urgent

    I just got an error on my Plex Docker container telling me to please make sure the drive is attached. I found that /mnt/user is not showing in the Docker container, but when I do an ls -l in the CLI I get some weird permissions. Screenshots and diagnostics attached.

    cli.png

    docker.png

    thor-diagnostics-20180920-1011.zip
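    For reference, the sort of check described above looks roughly like this (a sketch; the container name "plex" and the paths are assumptions, adjust to your setup):

    ls -ld /mnt/user /mnt/user/*          # the host's view of the user shares and their permissions
    docker exec plex ls -l /mnt/user      # what the container sees at the same path

    If /mnt/user has vanished on the host, both commands fail; if only the container's view is broken, the first one still lists the shares.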




    User Feedback

    Recommended Comments



    48 minutes ago, trurl said:

    Download 6.5.3 from the Downloads link at the upper right of this forum. Unzip the download and replace all the bz* files on your flash with the ones from the download.

    Downloaded from https://unraid.net/download, updated the bz* files, rebooted, and I'm back on 6.5.3. Thank you so much!
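    For anyone following along, the file swap trurl describes amounts to something like this on a running server (a sketch; it assumes the flash drive is mounted at /boot and that the 6.5.3 zip was extracted to /tmp/unraid-6.5.3, both of which may differ on your setup):

    cp /tmp/unraid-6.5.3/bz* /boot/    # overwrite bzimage, bzroot, etc. on the flash
    sync                               # make sure the copies hit the flash before rebooting

    After the copy, reboot and the server comes back up on whichever version's bz* files are now on the flash.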

    40 minutes ago, trurl said:

    Download 6.5.3 from the Downloads link at the upper right of this forum. Unzip the download and replace all the bz* files on your flash with the ones from the download.

    I will try, thank you.


    Your diagnostic file has many warnings about the "unassigned devices" plugin being unable to communicate.

    Did you start your system in safe mode to rule out any foul play by plugins?

    6 hours ago, Can0nfan said:

    I downgraded and my server only lasted like 2 minutes and went hard down

    That means upgrading to 6.6.0 was a red herring. There is a deeper issue with your server.

     

    As I originally asked and @bonienl reiterated, please boot in safe mode (non-GUI) to rule out any plugin issue.


    Probably this issue is related to NFS mounts of directories under /mnt/user/sharename.  Do you have to use NFS in this manner?  Why not use SMB?  Using UD to do this is untested by anyone at LimeTech.

    40 minutes ago, limetech said:

    Probably this issue is related to NFS mounts of directories under /mnt/user/sharename.  Do you have to use NFS in this manner?  Why not use SMB?  Using UD to do this is untested by anyone at LimeTech.

    I use NFS because my other unRAID server has a Fedora Server VM running as a reverse proxy for Sonarr and Radarr, which needs the mounts on this server to dump the downloads to. As far as I know Fedora doesn't support Samba, so the mounts are NFS.

    14 minutes ago, jonathanm said:

    What am I doing wrong? To simplify, I set my Media share to public and this is what I get:

     

    [michael@proxybox ~]$ sudo mount -t cifs -o //10.0.0.87/mnt/user/Media/TV /home/raid/TV
    mount.cifs: bad UNC (10.0.0.87:/mnt/user/Media/TV)

    Getting this now... sorry, my CLI troubleshooting is very limited.
     

    [michael@proxybox ~]$ sudo mount -t cifs -o //10.0.0.87/Media/TV /home/raid/TV
    [sudo] password for michael:
    mount.cifs: bad UNC (10.0.0.87:/mnt/user/Media/TV)

     

    By comparison, here is how the NFS mount is looking (unmounted while I try to mount over SMB):


    nfs.png
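    The screenshot itself isn't reproduced here; for context, an NFS mount of an Unraid user share from a Linux client generally takes a form along these lines (a sketch; the server IP, share, and mount point are borrowed from the SMB attempts in this thread and may not match the actual entry):

    sudo mount -t nfs 10.0.0.87:/mnt/user/Media/TV /home/raid/TV

    Note that, unlike SMB, the NFS export path does include the /mnt/user prefix.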


    When using SMB you use //server/sharename (i.e. omit the /mnt/user part, which is not visible at the Samba level).

    6 minutes ago, itimpi said:

    When using SMB you use //server/sharename (i.e. omit the /mnt/user part, which is not visible at the Samba level).

    Just did:

     

    [michael@proxybox ~]$ sudo mount -t cifs -o //10.0.0.87/Media/TV /home/raid/TV
    [sudo] password for michael:
    mount.cifs: bad UNC (10.0.0.87:/mnt/user/Media/TV)
     
     
    The yellow text is the output after I type my password.
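    For comparison, a working CIFS mount of that share would normally look something like the sketch below; this is only the general shape of the command, and the guest/credentials options are assumptions about how the share is secured:

    sudo mount -t cifs //10.0.0.87/Media/TV /home/raid/TV -o guest
    # or, with a user account defined on the server:
    sudo mount -t cifs //10.0.0.87/Media/TV /home/raid/TV -o username=michael

    Two details matter here: the source is //server/sharename with no /mnt/user prefix, and -o must be followed by an options string. In the attempts above, -o with nothing after it swallows the //10.0.0.87/... path as its option list, so mount.cifs never receives the intended share.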

    Regardless of my issues mounting over SMB, NFS mounts work great in unRAID 6.5.3. Why is it broken and causing such huge issues with the /mnt/user folder in unRAID 6.6?

    5 minutes ago, limetech said:

    don't know

    I hope my diagnostics posted earlier can help find the answer. I love the new 6.6, but I can't run it on a server clients connect to while /mnt/user keeps "disappearing". I'm back on 6.5.3 and stable for over 12 hours now.


    Add to that, I sent 5 Mellanox InfiniBand cards to Eric so that he can try to integrate the drivers into the kernel (which looks like it won't be until 6.7 by the sounds of it), so I can have my two unRAID servers transfer backups between each other at up to 40 Gbps, as the current 1 Gbps is far too slow.


    @limetech @bonienl @jonp

     

    Is this issue on your radar and being worked on? Is there anything we can provide to help you do so? Thanks!

     

    This is definitely a recurring issue, and unfortunately my application doesn't like using SMB shares.
     
    It seems to be caused by the use of NFS shares. 6.6.0 runs fairly stably until I mount an NFS share. After anywhere from 10 minutes to 3 hours, my /mnt/user folder disappears, which creates a cascade of chaos: all the shares disappear, which in turn breaks the NFS connection and any other application using the shares, including the Docker containers.
     
    I believe there's some sort of memory issue between the shfs process running on the Unraid server and nfsd. I'm unfamiliar with the implementation of shfs that's running on the Unraid server and can't find any online documentation to help me troubleshoot further.
     
    The process that actually uses the /mnt/user mount point is: /usr/local/sbin/shfs /mnt/user -disks 63 2048000000 -o noatime,big_writes,allow_other -o remember=330 |& logger
     
    The preceding process fails for some reason when nfsd crashes with the following error:
    Sep 20 02:40:01 systemname rpcbind[121456]: connect from 10.10.10.18 to getport/addr(nlockmgr)
    Sep 20 02:45:01 systemname rpcbind[124301]: connect from 10.10.10.18 to getport/addr(nlockmgr)
    Sep 20 02:48:46 systemname kernel: ------------[ cut here ]------------
    Sep 20 02:48:46 systemname kernel: nfsd: non-standard errno: -107
    Sep 20 02:48:46 systemname kernel: WARNING: CPU: 1 PID: 3577 at fs/nfsd/nfsproc.c:817 nfserrno+0x44/0x4a [nfsd]
    Sep 20 02:48:46 systemname kernel: Modules linked in: veth xt_nat macvlan ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs nfsd lockd grace sunrpc md_mod sb_edac kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd isci libsas glue_helper e1000e intel_agp intel_gtt i2c_piix4 ahci intel_rapl_perf vmxnet3 scsi_transport_sas i2c_core ata_piix libahci agpgart button
    Sep 20 02:48:46 systemname kernel: CPU: 1 PID: 3577 Comm: nfsd Not tainted 4.18.8-unRAID #1
    Sep 20 02:48:46 systemname kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
    Sep 20 02:48:46 systemname kernel: RIP: 0010:nfserrno+0x44/0x4a [nfsd]
    Sep 20 02:48:46 systemname kernel: Code: c0 48 83 f8 22 75 e2 80 3d b3 06 01 00 00 bb 00 00 00 05 75 17 89 fe 48 c7 c7 3b ea 27 a0 c6 05 9c 06 01 00 01 e8 8a 9c dd e0 <0f> 0b 89 d8 5b c3 48 83 ec 18 31 c9 ba ff 07 00 00 65 48 8b 04 25
    Sep 20 02:48:46 systemname kernel: RSP: 0018:ffffc90002253db8 EFLAGS: 00010286
    Sep 20 02:48:46 systemname kernel: RAX: 0000000000000000 RBX: 0000000005000000 RCX: 0000000000000007
    Sep 20 02:48:46 systemname kernel: RDX: 0000000000000000 RSI: ffff88042d656470 RDI: ffff88042d656470
    Sep 20 02:48:46 systemname kernel: RBP: ffffc90002253e08 R08: 0000000000000003 R09: ffff88043ff05700
    Sep 20 02:48:46 systemname kernel: R10: 0000000000000671 R11: 000000000002273c R12: ffff880428387808
    Sep 20 02:48:46 systemname kernel: R13: ffff8804086e2a58 R14: 0000000000000001 R15: ffffffffa027e2a0
    Sep 20 02:48:46 systemname kernel: FS: 0000000000000000(0000) GS:ffff88042d640000(0000) knlGS:0000000000000000
    Sep 20 02:48:46 systemname kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Sep 20 02:48:46 systemname kernel: CR2: 000000c4200d6000 CR3: 0000000001e0a005 CR4: 00000000000606e0
    Sep 20 02:48:46 systemname kernel: Call Trace:
    Sep 20 02:48:46 systemname kernel: nfsd_open+0x15e/0x17c [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd_write+0x4c/0xaa [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd3_proc_write+0xad/0xdb [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd_dispatch+0xb4/0x169 [nfsd]
    Sep 20 02:48:46 systemname kernel: svc_process+0x4b5/0x666 [sunrpc]
    Sep 20 02:48:46 systemname kernel: ? nfsd_destroy+0x48/0x48 [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd+0xeb/0x142 [nfsd]
    Sep 20 02:48:46 systemname kernel: kthread+0x10b/0x113
    Sep 20 02:48:46 systemname kernel: ? kthread_flush_work_fn+0x9/0x9
    Sep 20 02:48:46 systemname kernel: ret_from_fork+0x35/0x40
    Sep 20 02:48:46 systemname kernel: ---[ end trace 51a513aa08ead34a ]---
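    Since the server has to be rebooted once /mnt/user goes away, it can help to timestamp the exact moment of failure so it lines up with traces like the one above. A throwaway watchdog along these lines could do that (purely a sketch; the 30-second interval and the log tag are arbitrary):

    #!/bin/bash
    # Minimal watchdog sketch: write a syslog line the moment /mnt/user stops
    # answering, so the failure time can be matched against the nfsd trace.
    while sleep 30; do
        if ! stat /mnt/user >/dev/null 2>&1; then
            logger -t usermount-watch "/mnt/user is no longer accessible"
            break
        fi
    done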
    5 hours ago, edgedog said:

    Is this issue on your radar and being worked on? Is there anything we can provide to help you do so? Thanks!

    Yes, you should already know we need the diagnostics.zip, not just a syslog snippet, which barely helps to troubleshoot anything.

     

    Also, it appears you are running Unraid in an ESXi virtual machine. We cannot reproduce this exact issue because we cannot duplicate your exact config. That said, it's possible you are running out of memory. This is because NFS uses an archaic concept called "file handles", which is a numeric value that maps to a file instead of a path. In a lot of file systems this maps to the inode number. In 'shfs' there are no fixed inodes that correspond to files. Instead, inodes are generated and kept in memory by FUSE. That "remember=330" mount option tells FUSE to keep these inodes in memory for 5 1/2 minutes. This was chosen because the typical modern NFS client will cache file handles for 5 minutes. If the client asks for I/O on that handle within 5 minutes and the handle is no longer valid, you get "stale file handle" messages. After 5 minutes, the client typically uses a path to re-read the file handle. However, you can open a lot of files in 5 minutes. This is made worse if you have something like the 'cache_dirs' plugin running against shfs mount points. Maybe try increasing the memory allotted to the VM and/or reducing that 'remember' value.

     

    On the other hand, it could be an entirely different issue; we don't have enough info to determine this.
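    A few read-only checks along these lines can help confirm whether memory is actually the bottleneck before changing anything (a sketch; output formats will vary, and none of it modifies the server):

    grep shfs /proc/mounts        # confirm the FUSE mount backing /mnt/user is still registered
    ps -o pid,rss,args -C shfs    # full shfs command line (including remember=) and its resident memory
    free -m                       # overall memory headroom on the VM

    The remember= value is a FUSE library option passed on the shfs command line quoted earlier in the thread, so the ps output is where to look for it rather than /proc/mounts.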


    Same here: Docker CPU usage pegs all cores to 100%, then all mounts in /mnt/user disappear.
    I'd attach a diag, but I rolled back in order to get storage back online. Apologies.

    FYI: I rolled back to 6.5.3 with the same containers and it's stable.


    I'm having the same issue. In the syslog it looks like NFS crashes and then /mnt/user is inaccessible, and I have to reboot the whole system to get it back. I'm going to downgrade to 6.5 for now.

    21 hours ago, limetech said:

    Yes, you should already know we need the diagnostics.zip, not just a syslog snippet, which barely helps to troubleshoot anything.

     

    Also, it appears you are running Unraid in an ESXi virtual machine. We cannot reproduce this exact issue because we cannot duplicate your exact config. That said, it's possible you are running out of memory. This is because NFS uses an archaic concept called "file handles", which is a numeric value that maps to a file instead of a path. In a lot of file systems this maps to the inode number. In 'shfs' there are no fixed inodes that correspond to files. Instead, inodes are generated and kept in memory by FUSE. That "remember=330" mount option tells FUSE to keep these inodes in memory for 5 1/2 minutes. This was chosen because the typical modern NFS client will cache file handles for 5 minutes. If the client asks for I/O on that handle within 5 minutes and the handle is no longer valid, you get "stale file handle" messages. After 5 minutes, the client typically uses a path to re-read the file handle. However, you can open a lot of files in 5 minutes. This is made worse if you have something like the 'cache_dirs' plugin running against shfs mount points. Maybe try increasing the memory allotted to the VM and/or reducing that 'remember' value.

     

    On the other hand, it could be an entirely different issue; we don't have enough info to determine this.

    Yes sir. I submitted my non-anonymized diagnostics.zip through the Unraid GUI's feedback/bug report feature on 9/20/2018, a little after 11 am UTC. I haven't heard from anyone regarding that submission, so that was probably the incorrect way to submit it. I'm sorry for my ignorance. If there's a better way to get you the info, please let me know.

     

    Thanks for the information about how NFS and shfs work. At the time of the diagnostics.zip, I was booted in safe mode and my Unraid VM had 16GB of RAM allocated, with 13GB of that available for use. I subsequently increased the VM RAM to 40GB for test purposes and continued to experience the crashes. I don't believe there's a lack of memory unless nfsd or shfs is unable to acquire available memory for some reason. But I'm definitely willing to test your theory by modifying the remember parameter of the shfs process. Where is the file I should modify to set that parameter? I've scoured the filesystem but have been unable to find it.

     

    Thanks a bunch for responding!

     

     





