• /mnt/user disappeared


    Can0n
    • Status: Retest, Priority: Urgent

    I just got an error on my Plex Docker container telling me to please make sure the drive is attached. I found that /mnt/user is not showing in the Docker container, but when I do an ls -l in the CLI I get some weird permissions. Screenshots and diagnostics attached.

    cli.png

    docker.png

    thor-diagnostics-20180920-1011.zip
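    For reference, the sort of check described above looks roughly like this (a sketch; the container name "plex" and the paths are assumptions, adjust to your setup):

    ls -ld /mnt/user /mnt/user/*          # the host's view of the user shares and their permissions
    docker exec plex ls -l /mnt/user      # what the container sees at the same path

    If /mnt/user has vanished on the host, both commands fail; if only the container's view is broken, the first one still lists the shares.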




    User Feedback

    Recommended Comments



    48 minutes ago, trurl said:

    Download 6.5.3 from the Downloads link at the upper right of this forum. Unzip the download and replace all the bz* files on your flash with the ones from the download.

    Downloaded from https://unraid.net/download, updated the bz* files, rebooted, and I'm back on 6.5.3. Thank you so much!
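    For anyone following along, the file swap trurl describes amounts to something like this on a running server (a sketch; it assumes the flash drive is mounted at /boot and that the 6.5.3 zip was extracted to /tmp/unraid-6.5.3, both of which may differ on your setup):

    cp /tmp/unraid-6.5.3/bz* /boot/    # overwrite bzimage, bzroot, etc. on the flash
    sync                               # make sure the copies hit the flash before rebooting

    After the copy, reboot and the server comes back up on whichever version's bz* files are now on the flash.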

    40 minutes ago, trurl said:

    Download 6.5.3 from the Downloads link at the upper right of this forum. Unzip the download and replace all the bz* files on your flash with the ones from the download.

    I will try, thank you.


    Your diagnostic file has many warnings about the "unassigned devices" plugin being unable to communicate.

    Did you start your system in safe mode to rule out any foul play by plugins?

    6 hours ago, Can0nfan said:

    I downgraded and my server only lasted like 2 minutes and went hard down

    That means upgrading to 6.6.0 was a red herring. There is a deeper issue with your server.

     

    As I originally asked and @bonienl reiterated, please boot in safe mode (non-GUI) to rule out any plugin issue.


    Probably this issue is related to NFS mounts of directories under /mnt/user/sharename.  Do you have to use NFS in this manner?  Why not use SMB?  Using UD to do this is untested by anyone at LimeTech.

    40 minutes ago, limetech said:

    Probably this issue is related to NFS mounts of directories under /mnt/user/sharename.  Do you have to use NFS in this manner?  Why not use SMB?  Using UD to do this is untested by anyone at LimeTech.

    I use NFS because my other unRAID server has a Fedora Server VM running as a reverse proxy for Sonarr and Radarr, which needs the mounts on this server to dump the downloads to. As far as I know Fedora doesn't support Samba, so the mounts are NFS.

    14 minutes ago, jonathanm said:

    What am I doing wrong? To simplify, I set my Media share to public and this is what I get:

     

    [michael@proxybox ~]$ sudo mount -t cifs -o //10.0.0.87/mnt/user/Media/TV /home/raid/TV
    mount.cifs: bad UNC (10.0.0.87:/mnt/user/Media/TV)

    Getting this now... sorry, my CLI troubleshooting is very limited.
     

    [michael@proxybox ~]$ sudo mount -t cifs -o //10.0.0.87/Media/TV /home/raid/TV
    [sudo] password for michael:
    mount.cifs: bad UNC (10.0.0.87:/mnt/user/Media/TV)

     

    By comparison, here is how the NFS mount is looking (unmounted while I try to mount over SMB):


    nfs.png
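    The screenshot itself isn't reproduced here; for context, an NFS mount of an Unraid user share from a Linux client generally takes a form along these lines (a sketch; the server IP, share, and mount point are borrowed from the SMB attempts in this thread and may not match the actual entry):

    sudo mount -t nfs 10.0.0.87:/mnt/user/Media/TV /home/raid/TV

    Note that, unlike SMB, the NFS export path does include the /mnt/user prefix.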


    When using SMB you use //server/sharename (i.e. omit the /mnt/user part, which is not visible at the Samba level).

    6 minutes ago, itimpi said:

    When using SMB you use //server/sharename (i.e. omit the /mnt/user part, which is not visible at the Samba level).

    Just did:

     

    [michael@proxybox ~]$ sudo mount -t cifs -o //10.0.0.87/Media/TV /home/raid/TV
    [sudo] password for michael:
    mount.cifs: bad UNC (10.0.0.87:/mnt/user/Media/TV)
     
     
    The yellow text is the output after I type my password.
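    For comparison, a working CIFS mount of that share would normally look something like the sketch below; this is only the general shape of the command, and the guest/credentials options are assumptions about how the share is secured:

    sudo mount -t cifs //10.0.0.87/Media/TV /home/raid/TV -o guest
    # or, with a user account defined on the server:
    sudo mount -t cifs //10.0.0.87/Media/TV /home/raid/TV -o username=michael

    Two details matter here: the source is //server/sharename with no /mnt/user prefix, and -o must be followed by an options string. In the attempts above, -o with nothing after it swallows the //10.0.0.87/... path as its option list, so mount.cifs never receives the intended share.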

    Regardless of my issues mounting over SMB, NFS mounts work great in unRAID 6.5.3. Why is it broken and causing such huge issues with the /mnt/user folder in unRAID 6.6?

    5 minutes ago, limetech said:

    don't know

    I hope my diagnostics posted earlier can help find the answer. I love the new 6.6, but I can't run it on a server clients connect to while /mnt/user keeps "disappearing". I'm back on 6.5.3 and stable for over 12 hours now.


    Add to that, I sent 5 Mellanox InfiniBand cards to Eric so that he can try to integrate the drivers into the kernel (which looks like it won't be until 6.7 by the sounds of it), so I can have my two unRAID servers transfer backups between each other at up to 40 Gbps, as the current 1 Gbps is far too slow.


    @limetech @bonienl @jonp

     

    Is this issue on your radar and being worked on? Is there anything we can provide to help you do so? Thanks!

     

    This is definitely a recurring issue, and unfortunately my application doesn't like using SMB shares.
     
    It seems to be caused by the use of NFS shares. 6.6.0 runs fairly stably until I mount an NFS share. After anywhere from 10 minutes to 3 hours, my /mnt/user folder disappears, which creates a cascade of chaos: all the shares disappear, which in turn breaks the NFS connection and any other application using the shares, including the Docker containers.
     
    I believe there's some sort of memory issue between the shfs process running on the Unraid server and nfsd. I'm unfamiliar with the implementation of shfs that's running on the Unraid server and can't find any online documentation to help me troubleshoot further.
     
    The process that actually uses the /mnt/user mount point is: /usr/local/sbin/shfs /mnt/user -disks 63 2048000000 -o noatime,big_writes,allow_other -o remember=330 |& logger
     
    The preceding process fails for some reason when nfsd crashes with the following error:
    Sep 20 02:40:01 systemname rpcbind[121456]: connect from 10.10.10.18 to getport/addr(nlockmgr)
    Sep 20 02:45:01 systemname rpcbind[124301]: connect from 10.10.10.18 to getport/addr(nlockmgr)
    Sep 20 02:48:46 systemname kernel: ------------[ cut here ]------------
    Sep 20 02:48:46 systemname kernel: nfsd: non-standard errno: -107
    Sep 20 02:48:46 systemname kernel: WARNING: CPU: 1 PID: 3577 at fs/nfsd/nfsproc.c:817 nfserrno+0x44/0x4a [nfsd]
    Sep 20 02:48:46 systemname kernel: Modules linked in: veth xt_nat macvlan ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs nfsd lockd grace sunrpc md_mod sb_edac kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd isci libsas glue_helper e1000e intel_agp intel_gtt i2c_piix4 ahci intel_rapl_perf vmxnet3 scsi_transport_sas i2c_core ata_piix libahci agpgart button
    Sep 20 02:48:46 systemname kernel: CPU: 1 PID: 3577 Comm: nfsd Not tainted 4.18.8-unRAID #1
    Sep 20 02:48:46 systemname kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
    Sep 20 02:48:46 systemname kernel: RIP: 0010:nfserrno+0x44/0x4a [nfsd]
    Sep 20 02:48:46 systemname kernel: Code: c0 48 83 f8 22 75 e2 80 3d b3 06 01 00 00 bb 00 00 00 05 75 17 89 fe 48 c7 c7 3b ea 27 a0 c6 05 9c 06 01 00 01 e8 8a 9c dd e0 <0f> 0b 89 d8 5b c3 48 83 ec 18 31 c9 ba ff 07 00 00 65 48 8b 04 25
    Sep 20 02:48:46 systemname kernel: RSP: 0018:ffffc90002253db8 EFLAGS: 00010286
    Sep 20 02:48:46 systemname kernel: RAX: 0000000000000000 RBX: 0000000005000000 RCX: 0000000000000007
    Sep 20 02:48:46 systemname kernel: RDX: 0000000000000000 RSI: ffff88042d656470 RDI: ffff88042d656470
    Sep 20 02:48:46 systemname kernel: RBP: ffffc90002253e08 R08: 0000000000000003 R09: ffff88043ff05700
    Sep 20 02:48:46 systemname kernel: R10: 0000000000000671 R11: 000000000002273c R12: ffff880428387808
    Sep 20 02:48:46 systemname kernel: R13: ffff8804086e2a58 R14: 0000000000000001 R15: ffffffffa027e2a0
    Sep 20 02:48:46 systemname kernel: FS: 0000000000000000(0000) GS:ffff88042d640000(0000) knlGS:0000000000000000
    Sep 20 02:48:46 systemname kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Sep 20 02:48:46 systemname kernel: CR2: 000000c4200d6000 CR3: 0000000001e0a005 CR4: 00000000000606e0
    Sep 20 02:48:46 systemname kernel: Call Trace:
    Sep 20 02:48:46 systemname kernel: nfsd_open+0x15e/0x17c [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd_write+0x4c/0xaa [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd3_proc_write+0xad/0xdb [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd_dispatch+0xb4/0x169 [nfsd]
    Sep 20 02:48:46 systemname kernel: svc_process+0x4b5/0x666 [sunrpc]
    Sep 20 02:48:46 systemname kernel: ? nfsd_destroy+0x48/0x48 [nfsd]
    Sep 20 02:48:46 systemname kernel: nfsd+0xeb/0x142 [nfsd]
    Sep 20 02:48:46 systemname kernel: kthread+0x10b/0x113
    Sep 20 02:48:46 systemname kernel: ? kthread_flush_work_fn+0x9/0x9
    Sep 20 02:48:46 systemname kernel: ret_from_fork+0x35/0x40
    Sep 20 02:48:46 systemname kernel: ---[ end trace 51a513aa08ead34a ]---
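    Since the server has to be rebooted once /mnt/user goes away, it can help to timestamp the exact moment of failure so it lines up with traces like the one above. A throwaway watchdog along these lines could do that (purely a sketch; the 30-second interval and the log tag are arbitrary):

    #!/bin/bash
    # Minimal watchdog sketch: write a syslog line the moment /mnt/user stops
    # answering, so the failure time can be matched against the nfsd trace.
    while sleep 30; do
        if ! stat /mnt/user >/dev/null 2>&1; then
            logger -t usermount-watch "/mnt/user is no longer accessible"
            break
        fi
    done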
    5 hours ago, edgedog said:

    Is this issue on your radar and being worked on? Is there anything we can provide to help you do so? Thanks!

    Yes, you should already know we need the diagnostics.zip, not just a syslog snippet, which barely helps to troubleshoot anything.

     

    Also, it appears you are running Unraid in an ESXi virtual machine. We cannot reproduce this exact issue because we cannot duplicate your exact config. That said, it's possible you are running out of memory. This is because NFS uses an archaic concept called "file handles", which is a numeric value that maps to a file instead of a path. In a lot of file systems this maps to the inode number. In 'shfs' there are no fixed inodes that correspond to files. Instead, inodes are generated and kept in memory by FUSE. That "remember=330" mount option tells FUSE to keep these inodes in memory for 5 1/2 minutes. This was chosen because the typical modern NFS client will cache file handles for 5 minutes. If the client asks for I/O on that handle within 5 minutes and the handle is no longer valid, you get "stale file handle" messages. After 5 minutes, the client typically uses a path to re-read the file handle. However, you can open a lot of files in 5 minutes. This is made worse if you have something like the 'cache_dirs' plugin running against shfs mount points. Maybe try increasing the memory allotted to the VM and/or reducing that 'remember' value.

     

    On the other hand, it could be an entirely different issue; we don't have enough info to determine this.
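    A few read-only checks along these lines can help confirm whether memory is actually the bottleneck before changing anything (a sketch; output formats will vary, and none of it modifies the server):

    grep shfs /proc/mounts        # confirm the FUSE mount backing /mnt/user is still registered
    ps -o pid,rss,args -C shfs    # full shfs command line (including remember=) and its resident memory
    free -m                       # overall memory headroom on the VM

    The remember= value is a FUSE library option passed on the shfs command line quoted earlier in the thread, so the ps output is where to look for it rather than /proc/mounts.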


    Same here: Docker CPU usage pegs all cores to 100%, then all mounts in /mnt/user disappear.
    I'd attach a diag, but I rolled back in order to get storage back online. Apologies.

    FYI: I rolled back to 6.5.3 with the same containers and it's stable.


    I'm having the same issue. In the syslog it looks like NFS crashes and then /mnt/user is inaccessible, and I have to reboot the whole system to get it back. I'm going to downgrade to 6.5 for now.

    21 hours ago, limetech said:

    Yes, you should already know we need the diagnostics.zip, not just a syslog snippet, which barely helps to troubleshoot anything.

     

    Also, it appears you are running Unraid in an ESXi virtual machine. We cannot reproduce this exact issue because we cannot duplicate your exact config. That said, it's possible you are running out of memory. This is because NFS uses an archaic concept called "file handles", which is a numeric value that maps to a file instead of a path. In a lot of file systems this maps to the inode number. In 'shfs' there are no fixed inodes that correspond to files. Instead, inodes are generated and kept in memory by FUSE. That "remember=330" mount option tells FUSE to keep these inodes in memory for 5 1/2 minutes. This was chosen because the typical modern NFS client will cache file handles for 5 minutes. If the client asks for I/O on that handle within 5 minutes and the handle is no longer valid, you get "stale file handle" messages. After 5 minutes, the client typically uses a path to re-read the file handle. However, you can open a lot of files in 5 minutes. This is made worse if you have something like the 'cache_dirs' plugin running against shfs mount points. Maybe try increasing the memory allotted to the VM and/or reducing that 'remember' value.

     

    On the other hand, it could be an entirely different issue; we don't have enough info to determine this.

    Yes sir. I submitted my non-anonymized diagnostics.zip through the Unraid GUI's feedback/bug report feature on 9/20/2018, a little after 11 am UTC. I haven't heard from anyone regarding that submission, so that was probably the incorrect way to submit it. I'm sorry for my ignorance. If there's a better way to get you the info, please let me know.

     

    Thanks for the information about how NFS and shfs work. At the time of the diagnostics.zip, I was booted in safe mode and my Unraid VM had 16GB of RAM allocated, with 13GB of that available for use. I subsequently increased the VM RAM to 40GB for test purposes and continued to experience the crashes. I don't believe there's a lack of memory unless nfsd or shfs is unable to acquire available memory for some reason. But I'm definitely willing to test your theory by modifying the remember parameter of the shfs process. Where is the file I should modify to set that parameter? I've scoured the filesystem but have been unable to find it.

     

    Thanks a bunch for responding!

     

     





