Comments posted by golli53
-
On 2/7/2020 at 1:48 AM, limetech said:
Verified, yes that works.
You only need to add a single line in SMB Extras:
case sensitive = yes
Note: "yes" == "true" and is case insensitive
This fixes the stat issue for very large folders - thanks for your hard work! Unfortunately, SMB is still quite slow - I think the listdir calls are still ~2x slower than with prior versions, despite Hard Links being disabled. With the tweaks, my scripts now run instead of stalling, though they are still noticeably slower. I'll try to reproduce and compare when I get a chance to try 6.8.2 again. Regardless, thanks for your efforts here.
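For reference, a minimal sketch of how I time the listdir calls (the path is a placeholder - point it at a mounted share or UNC path and run it on each version to compare averages):

```python
import os
import time

# Placeholder path; replace with a mount point or UNC path for the share.
PATH = "."

# Average os.listdir latency over repeated calls.
samples = []
for _ in range(20):
    start = time.perf_counter()
    os.listdir(PATH)
    samples.append(time.perf_counter() - start)

print(f"avg listdir time: {sum(samples) / len(samples):.4f}s")
```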
-
54 minutes ago, limetech said:
Where do all those dummy files get created? That is, on cache, on disk1..N, or spread among them all?
All on cache (which is 2xSSD RAID1 btrfs for me). The same issue occurs with a folder that's on the array though (spread across disks). Seems to be an SMB issue because I don't see extra lag when calling stat from the unRAID shell or through NFS from a Linux client.
-
@limetech First of all, thank you for taking the time to dig into this. From my much more limited testing, the issue seems to be a painful one to track down.
I upgraded yesterday and while this tweak solves the listdir times, stat times for missing files in large directories are still bugged (observation 2 in the below post):
For convenience, I reproduced in Linux and wrote this simple script in bash:
# unraid
cd /mnt/user/myshare
mkdir testdir
cd testdir
touch dummy{000000..200000}

# client
sudo mkdir /myshare
sudo mount -t cifs -o username=guest //192.168.1.100/myshare /myshare
while true; do
    start=$SECONDS
    stat /myshare/testdir/does_not_exist > /dev/null 2>&1
    end=$SECONDS
    echo "$((end-start)) "
done
On 6.8.x, each call takes 7-8s (vs 0-1s on previous versions), regardless of hard link support. The time complexity is nonlinear with the number of files (calls go to 15s if I increase the number of files by 50% to 300k).
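To illustrate the measurement, here's a hypothetical scaled-down local sketch: it creates dummy files in a temp directory and times stat() of a missing file as the directory grows. Run locally the time should stay flat; the bug above is that over SMB on 6.8.x it grows nonlinearly with file count.

```python
import os
import tempfile
import time

def time_missing_stat(dirpath, trials=50):
    # Average the cost of stat() on a file that does not exist.
    start = time.perf_counter()
    for _ in range(trials):
        try:
            os.stat(os.path.join(dirpath, "does_not_exist"))
        except FileNotFoundError:
            pass
    return (time.perf_counter() - start) / trials

with tempfile.TemporaryDirectory() as d:
    for n in (1000, 2000, 4000):
        # Grow the directory to n dummy files, then re-measure.
        for i in range(n):
            open(os.path.join(d, f"dummy{i:06d}"), "w").close()
        print(n, f"{time_missing_stat(d):.6f}s")
```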
- 1
-
I don't have a test server for unRAID, so I can only try out these suggestions on a weekend when I don't need my production environment up and running. For now, I'm going back to 6.6.7 to avoid slow SMB and the concurrent disk problem in 6.7.2.
Also, I think there was something else going on in addition to the 3-4x slower directory listings. Some of my apps would lag for 20 minutes compared to 5 seconds, so I think there were additional SMB performance regressions. I detailed some other slow behavior in the Prerelease thread, but those were just the regressions I happened to notice from debugging the code in a couple of my apps one weekend, so I may have missed others.
-
16 hours ago, limetech said:
FWIW I tried accessing a share on a remote server which has 3088 items in the top-level and it populated windows explorer near instantaneously. This was via WireGuard connection where the remote server has crappy DSL internet access with mere 4Mbits upload. These were all music directories and files and playing them via VLC worked ok, there was a slight pause to read the files but I attribute this to the aforementioned crappy DSL link. Clearly this is not exhibiting the issue being mentioned.
I guess it depends on the definition of near instantaneously. In my testing over many thousands of calls, I was averaging 2.5s vs 0.7s (for 6.7.2) for 3k items. When 2 programs are accessing SMB simultaneously, that becomes 5s vs 1.4s. For 10 programs, 25s vs 7s. I think it's common for services to access SMB shares on a server simultaneously.
-
19 minutes ago, BRiT said:
But your sample code is getting directory listings.
There are two issues, and the sample code has a section for each (preceded by a comment header).
Part 1 of the code is getting listings. That seems to be slower on 6.8.0 for all directories and is noticeable to a human on a single call without concurrency starting with a couple thousand files.
Part 2 of the sample code is only calling stat. I can only reproduce this issue for very large directories, but maybe that's because it requires large directories to produce a measurable difference.
- 1
-
7 minutes ago, BRiT said:
Subdir by some designation, like by create or archive date, is required.
That sort of directory even on native WinOS with local drives is going to be excruciatingly painful. The pain starts around the 30K range.
I never call a directory listing in that directory. I only open specific files by naming convention. So, adding subdirs would make things more inefficient because I would have to check for a file in each subdir. My current setup works very fast using a normal Samba server, e.g. 6.7.2 or Ubuntu.
The first issue is a problem for much smaller directories (a few thousand).
- 1
-
19 minutes ago, limetech said:
Are you saying there are 250,000 files in a single directory?
😀 Yes, it's a very big one for automatically archiving json files. There's no natural categorization for assigning subdirectories, so subdirectories wouldn't improve the speed for my app.
-
Attaching the testparm outputs. I also tried some debugging after rolling back again to the previous version. I see two differences in behavior:
- Each os.listdir call is ~3-4x slower on average on v6.8 vs v6.7.2 (2.5s vs 0.7s for a 6k-file directory)
- When calling os.stat on a single NONEXISTENT file in a 250k-file directory on 6.8 (should be microseconds per call), every 100 or so calls it hangs for ~5s
Concurrent calls are simply additive in terms of execution time, and concurrency itself doesn't seem to be the problem, but it makes the issue more apparent.
Note that under the hood, Python is just using the native Windows protocol for listing / accessing stats on these files; using Python just makes it easier to debug many requests.
My code for reproducing this below:
from datetime import datetime as dt
import os

# observation 1
while True:
    start = dt.now()
    os.listdir('//192.168.1.100/share/path')
    print((dt.now() - start).total_seconds())

# observation 2
while True:
    start = dt.now()
    try:
        os.stat('//192.168.1.100/share/bigpath/nonexistent.txt')
    except:
        pass
    print((dt.now() - start).total_seconds())
-
5 hours ago, limetech said:
No but Samba team changes defaults all the time from release-to-release (kinda maddening they do this).
Please type this command running 6.7.2:
testparm -sv > /boot/smb672.txt
and then boot 6.8-rc8 and type:
testparm -sv > /boot/smb680.txt
You now have those two text files on your flash which you can post here. Note: those files will contain your share names, if you don't want to post here then send to me via PM.
Also would be helpful to describe to me how to reproduce this issue.
Thanks - I will try as soon as I get a chance (I'm running a production environment).
I'm essentially calling code like below on several (~5) network folders with 5k subdirectories each from a Win10 client. That slows SMB directory listing to a halt for that client, including normal browsing using Windows file explorer.
def recursive_ls(path):
    files = os.listdir(path)
    for f in files:
        subpath = os.path.join(path, f)
        if os.path.isdir(subpath):
            recursive_ls(subpath)
[edit] Each subdirectory only has a couple files in it, so the number of files may not be the issue, but rather just the concurrent ls requests, so calling ls in a loop from multiple threads on one client may do the same thing (may be easier to set up using a shell script)
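A hypothetical sketch of that multi-threaded load in Python (SHARE_PATH, the thread count, and the duration are all assumptions - point SHARE_PATH at a mounted share to try to reproduce the slowdown):

```python
import os
import threading
import time

# Assumptions: SHARE_PATH would be a mounted SMB share; "." is a stand-in.
SHARE_PATH = "."
DURATION = 5    # seconds of sustained load
THREADS = 10

def ls_loop(stop_at):
    # Hammer the share with directory listings until the deadline passes.
    while time.monotonic() < stop_at:
        try:
            os.listdir(SHARE_PATH)
        except OSError:
            pass

stop_at = time.monotonic() + DURATION
threads = [threading.Thread(target=ls_loop, args=(stop_at,)) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("load finished")
```

While this runs, browsing the same share from Windows Explorer should show whether concurrent listings alone trigger the stall.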
-
Rolled back to 6.7.2 and this goes back to normal. Directory listing times with concurrent requests from the same client are back to near instant vs 2-10s.
Tried 6.8.0-rc8 and saw the same issues as rc7.
Sticking with 6.7.2 for now, because the update practically makes my Samba client operations unusable. Unfortunately, that means I still have the concurrent write performance issues on the other hand.
-
10 hours ago, veruszetec said:
Hey, quick question: Why is User0 considered deprecated?
Is there something I should be using instead to replace this functionality?
I'm interested in this also. I use user0 for several applications that do large background file transfers that I want to skip my cache (in order to avoid filling it up all the time)
- 1
-
On 9/12/2019 at 2:56 PM, yendi said:
Just curious - what is this graph from?
-
On 9/5/2019 at 6:33 AM, GHunter said:
I ran across this problem too and found that is was related to this problem "[6.7.x] Very slow array concurrent performance by Johnnie.Black" that has been reported.
I don't run plex but do have several other dockers that use sqlite and have never had a problem. Appdata is set to "cache only"
I also think these two are related. Experiencing both on 6.7.2
-
Does anyone have a sense whether 6.8 will fix this? I am considering downgrading to 6.6.7 as everything freezes during mover (I moved a bunch of mission-critical services to an older PC for now), but I started off on 6.7.x, so I would need to downgrade manually and am afraid I'll break something, e.g. plugins/settings.
Slow SMB performance
in Stable Releases
Posted · Edited by golli53
trimmed quote
Which version are you using? I saw a significant performance drop starting in 6.8.x, with only partial recovery of performance by modifying Tunable Direct IO and SMB case settings. 6.6.7 at least should be quite a bit faster and doesn't suffer from the multi-stream read/write issues in 6.7.x.