enJOyIT Posted January 27, 2022

Hi, a really strange problem... I have plenty of Debian Linux VMs running on Proxmox which are connected to several Unraid user shares. They keep unexpectedly losing the connection to these shares. Not every share is affected at the same time: sometimes share1 loses the connection, and another time it's share2. I can't tell when it happens... Yesterday every machine was connected to every share, and today one machine has lost the connection to one share. The other Unraid shares on this machine are fine, as are the other machines!

I moved to Unraid from OpenMediaVault and NEVER had such issues, so I presume it's related to Unraid. When it happens, the mounted folder isn't readable anymore, I get "Cannot read file", and in Midnight Commander the folder shows up as "?serien" or "?filme".

I'm mounting the shares via /etc/fstab:

//192.168.20.215/filme /mnt/filme cifs x-systemd.automount,username=plex,password=xxxxxxxx 0 0
//192.168.20.215/serien /mnt/serien cifs x-systemd.automount,username=plex,password=xxxxxxx 0 0
//192.168.20.215/musik /mnt/musik cifs x-systemd.automount,username=plex,password=xxxxxxx 0 0

Maybe there is something I am missing?
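When a share drops, the mount is still listed in /proc/mounts, but any access to it fails. This is the quick check I use to see which mounts have gone stale (just a sketch; the paths and the function name are my own, taken from my fstab above):

```shell
#!/bin/sh
# Report whether each mount point is still readable.
# A stale CIFS handle makes stat fail even though the
# share is still listed in /proc/mounts.
check_mounts() {
    for m in "$@"; do
        if stat -t "$m" >/dev/null 2>&1; then
            echo "$m OK"
        else
            echo "$m STALE"
        fi
    done
}

check_mounts /mnt/filme /mnt/serien /mnt/musik
```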
enJOyIT Posted January 28, 2022 (Author)

Today it happened again: no logs on my client?! It just dropped the connection. I had transferred some files from disk to disk (both in the array) minutes before (but nothing related to "serien"). Some files of "serien" were transferred to the cache directory at the same time. Is this related?!

The only info I have from this timeframe is from the Unraid server:

Jan 28 06:04:27 unraid kernel: mdcmd (49): set md_num_stripes 1280
Jan 28 06:04:27 unraid kernel: mdcmd (50): set md_queue_limit 80
Jan 28 06:04:27 unraid kernel: mdcmd (51): set md_sync_limit 5
Jan 28 06:04:27 unraid kernel: mdcmd (52): set md_write_method
Jan 28 06:09:52 unraid smbd[23687]: [2022/01/28 06:09:52.014082, 0] ../../source3/smbd/smb2_read.c:255(smb2_sendfile_send_data)
Jan 28 06:09:52 unraid smbd[23687]: smb2_sendfile_send_data: sendfile failed for file xxxx/xxxxx/yyyyyyyy.zzz (Connection reset by peer) for client ipv4:192.168.20.221:36518. Terminating

Client 192.168.20.221 is not the client PC with the failed connection from my first post, it's another one! But it has the same error: strange!
enJOyIT Posted January 28, 2022 (Author)

Additional info: I reconnected the shares and then started the mover to move some data from the cache to the array, and bam, the next share dropped. But no logs... on the client or the Unraid server.

It must have something to do with the mover. As I reconnected the "filme" share (umount and mount -a, which worked), the mover was still moving files... and in the next minute "serien" dropped...

"mount" on the client gives me this:

//192.168.20.215/musik on /mnt/musik type cifs (rw,relatime,vers=3.1.1,cache=strict,username=plex,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.20.215,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,x-systemd.automount)
//192.168.20.215/filme on /mnt/filme type cifs (rw,relatime,vers=3.1.1,cache=strict,username=plex,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.20.215,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,x-systemd.automount)
//192.168.20.215/serien on /mnt/serien type cifs (rw,relatime,vers=3.1.1,cache=strict,username=plex,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.20.215,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,x-systemd.automount)

What is going on here?????
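All three mounts show "serverino" in their options, meaning the client tracks server-side inode numbers. A tiny helper to pull that option out of a mount line, just as a sketch (the function name is mine):

```shell
#!/bin/sh
# Classify one line of 'mount' output by its inode-handling option.
# 'noserverino' must be matched first, since it contains 'serverino'
# as a substring.
cifs_ino_mode() {
    case "$1" in
        *noserverino*) echo noserverino ;;
        *serverino*)   echo serverino ;;
        *)             echo unknown ;;
    esac
}

cifs_ino_mode "//192.168.20.215/musik on /mnt/musik type cifs (rw,nounix,serverino,mapposix)"
# prints: serverino
```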
enJOyIT Posted January 28, 2022 (Author)

I can reproduce this error now. It's definitely connected to the mover/cache drive. If I have files on the cache and start the mover, the connection to the share that contains the moved files drops at the end. It even drops if I just copy files (via Windows) to the mount (a share with cache set to "Yes"). Maybe it's locking the drive for a second and my Linux machine thinks the mount is gone?!?!

Is there anyone from Limetech who can analyse this behaviour? Attached the diagnostics file etc. But IMHO there isn't really anything interesting in it, because I don't get any entries in the logfiles etc...

Similar threads (it's macOS but the behaviour is the same):

Same, but on Reddit:

I need help!

P.S. I'm getting two new SSDs (these are my cache drives) tomorrow (my current ones have no TRIM support with my HBA - LSI SAS3008). Maybe (but I doubt it) that will help?

unraid-diagnostics-20220128-0958.zip
enJOyIT Posted January 28, 2022 (Author)

Sorry for spamming, but I want to keep the posts separate for chronological order... I enabled mover logging, but still no helpful information:

Jan 28 11:48:24 unraid emhttpd: shcmd (94072): /usr/local/sbin/mover |& logger &
Jan 28 11:48:24 unraid root: mover: started
Jan 28 11:48:24 unraid move: move: file /mnt/cache/download/nzbget/completed/xxxxxxxxx/yyyyyyyyyy.abc
Jan 28 11:49:19 unraid root: mover: finished
enJOyIT Posted January 28, 2022 (Author)

Is there an official support email address? I'm willing to pay for support! I'm afraid that moving to Unraid was a big mistake if I can't get this problem solved.
dlandon Posted January 28, 2022

I think you are running into the stale file handle issue with CIFS mounts. What I think is happening: you are referring to a file on the cache, and when it is moved, the file handle changes and you can no longer access it.

When UD (the Unassigned Devices plugin) mounts a CIFS share, it uses the 'noserverino' parameter, which prevents the stale file handle. Example UD mount command:

/sbin/mount -t 'cifs' -o rw,noserverino,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777,uid=99,gid=100,credentials='/tmp/unassigned.devices/credentials_Public' '//MEDIASERVER/Public' '/mnt/remotes/MEDIASERVER_Public'

Basically, what 'noserverino' does is use a local inode number and not the server's, which can change if the file is moved. Take a look at all the parameters used here and see if any others would apply to your situation.

Give that a try. Let me know how it goes and we can work on it some more.
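If you'd rather stay with your fstab mounts instead of UD, adding the option there should be enough. An untested sketch, based on the entries from your first post:

```
# /etc/fstab -- original entries with 'noserverino' added
//192.168.20.215/filme  /mnt/filme  cifs  x-systemd.automount,noserverino,username=plex,password=xxxxxxxx  0 0
//192.168.20.215/serien /mnt/serien cifs  x-systemd.automount,noserverino,username=plex,password=xxxxxxx  0 0
//192.168.20.215/musik  /mnt/musik  cifs  x-systemd.automount,noserverino,username=plex,password=xxxxxxx  0 0
```

After editing, umount the shares and run 'mount -a' (or reboot) so the new option takes effect.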
enJOyIT Posted January 28, 2022 (Author)

Thank you for your reply! Meanwhile I tried the same with an NFS share... but it ended up with the same issue. Does NFS use the same mechanism?
dlandon Posted January 28, 2022

If you use NFSv4, you won't see the stale file handles. You will have to use Unraid 6.10rc2, though, for NFSv4 support.
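For reference, an NFSv4 fstab entry would look roughly like this — a sketch only; I'm assuming your share is exported as /mnt/user/filme, which is where Unraid exposes user shares:

```
# /etc/fstab -- NFSv4 variant (sketch, export path assumed)
192.168.20.215:/mnt/user/filme  /mnt/filme  nfs4  x-systemd.automount,vers=4.2  0 0
```

The 'nfs4' filesystem type (or a 'vers=4.x' option) is what forces version 4; a plain 'nfs' entry may still negotiate v3 and leave you with the same stale handles.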
enJOyIT Posted January 28, 2022 (Author)

Well... it seems that the "noserverino" parameter does the job. But there is another issue now; I think it's related to this parameter. The Unraid CPU usage is stuck at 20% now... Is this expected behaviour?
dlandon Posted January 28, 2022

Glad to hear you got things working. Samba is a rather cantankerous beast and has been known to cause this sort of behaviour. What version of Unraid are you using?
enJOyIT Posted January 28, 2022 (Author)

I am using 6.9.2
dlandon Posted January 28, 2022

There have been some Samba changes in 6.10. One of the latest is the addition of Samba security hardening. Sorry I can't be more specific; there is so much going by me in development that I can't keep track of it all. If you have the stomach for it, you could run the latest 6.10rc2, or wait for the 6.10 final and take a look at it then. For the moment, a wait-and-see approach might be best.
enJOyIT Posted January 28, 2022 (Author)

Ok, I will have a look and maybe upgrade to the RC (depends on the release date of the final version 🙂). Thank you very much so far!!!
dlandon Posted January 28, 2022

2 minutes ago, enJOyIT said:
"Thank you very much so far!!!"

You are very welcome. You're fairly new here, but I think you'll find the support you get on the forum is exceptional. If you ever feel you are not getting what you need, feel free to tag @Squid or myself and we'll be sure to get the right person on the job.
AndrewZ Posted January 28, 2022

Or email support if the forum isn't giving you the answers you need.