• [6.8.0] Slow SHFS file listings


    Vynce
    • Minor

    I use Arq to back up to a Minio docker container running on Unraid. It was working well with Unraid 6.7.2, but larger backups are failing with Unraid 6.8.0. The versions of Arq and Minio in use haven't changed recently.

     

    Arq is failing due to a GET request timeout:

    2019/12/19 00:04:17:767  DETAIL [thread 307] retrying GET /foo/?prefix=713EC506-32A1-4454-A885-19334B4FB242/objects/95&delimiter=/&max-keys=500: Error Domain=NSURLErrorDomain Code=-1001 "The request timed out."

    I reproduced an equivalent request using aws-cli; 3m10s seems excessive for getting a listing of ~1000 files.

    [REQUEST s3.ListObjectsV1] 05:58:00.385
    GET /foo?delimiter=%2F&prefix=713EC506-32A1-4454-A885-19334B4FB242%2Fobjects%2F91&encoding-type=url
    [RESPONSE] [06:01:10.817] [ Duration 3m10.432524s  Dn 93 B  Up 388 KiB ]
    200 OK
    Server: MinIO/RELEASE.2019-10-12T01-39-57Z
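
    For reference, a listing like that can be requested with something along these lines (the endpoint and credentials here are placeholders; the bucket and prefix are taken from the trace above):

    # Assumes the Minio access/secret keys are exported as AWS_ACCESS_KEY_ID /
    # AWS_SECRET_ACCESS_KEY; "http://tower:9000" is a placeholder endpoint.
    aws --endpoint-url http://tower:9000 s3api list-objects \
        --bucket foo \
        --prefix "713EC506-32A1-4454-A885-19334B4FB242/objects/91" \
        --delimiter "/" \
        --max-keys 500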

    The Minio container has a user share mapping for backend storage. If I perform essentially the same file listing from an Unraid terminal, it's also pretty slow:

    time ls /mnt/user/minio/foo/713EC506-32A1-4454-A885-19334B4FB242/objects/91* | wc -l
    1140
    real    0m24.676s
    user    0m0.242s
    sys     0m0.310s

    If I do the same thing using the disk mount point instead, it's several orders of magnitude faster:

    time ls /mnt/disk3/minio/foo/713EC506-32A1-4454-A885-19334B4FB242/objects/91* | wc -l
    1140
    real    0m0.090s
    user    0m0.069s
    sys     0m0.026s

    There are a lot of files in these folders, but I don't think it's an unreasonable amount (?):

    ls /mnt/disk3/minio/foo/713EC506-32A1-4454-A885-19334B4FB242/objects/ | wc -l
    278844

    Changing the Minio container path mapping to use the disk share instead of the user share works around the issue, but at some point I'll need the user share so the data can span multiple disks.
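
    For anyone curious, the workaround is just a matter of which host path is mapped into the container. Roughly (the image invocation and paths here are illustrative, not my exact Unraid template settings):

    # Before: the container's /data pointed at the user share (goes through shfs)
    #   -v /mnt/user/minio:/data
    # After: point it at the disk mount instead (bypasses shfs)
    #   -v /mnt/disk3/minio:/data
    docker run -d --name minio \
      -p 9000:9000 \
      -v /mnt/disk3/minio:/data \
      minio/minio server /data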

     

    I'd prefer not to downgrade to 6.7.2 to gather comparable metrics there, but I can if that would be helpful.

    unraid-diagnostics-20191227-1342.zip




    User Feedback

    Recommended Comments

    Yeah, I think it’s probably a similar (the same?) issue. I decided to post a new report since I see the poor performance right on Unraid itself — no SMB or other network protocols in the loop.

    Link to comment

    I don't see things being nearly as bad, but I do see sizeable differences when going through the SHFS layer. I do have cache-dirs running, so maybe that artificially boosts the direct-disk numbers?
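
    One way to take cache-dirs warming out of the equation would be to drop the kernel caches right before each timed run (note this flushes the page/dentry/inode caches system-wide, so everything will be cold afterwards):

    sync; echo 3 > /proc/sys/vm/drop_caches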

     

    On top of that, I notice a visible difference just from running the following two commands in a PuTTY SSH session. Going through FUSE naturally has its overhead, but I don't think it was ever this drastic.

     

    
    ls /mnt/disk1/Media/TV/*
    
    ls /mnt/user/Media/TV/*
    
    


    
    Dec 15 15:34:22 REAVER cache_dirs: Arguments=-m 11 -M 30 -l off
    Dec 15 15:34:22 REAVER cache_dirs: Max Scan Secs=30, Min Scan Secs=11
    Dec 15 15:34:22 REAVER cache_dirs: Scan Type=adaptive
    Dec 15 15:34:22 REAVER cache_dirs: Min Scan Depth=4
    Dec 15 15:34:22 REAVER cache_dirs: Max Scan Depth=none
    Dec 15 15:34:22 REAVER cache_dirs: Use Command='find -noleaf'
    Dec 15 15:34:22 REAVER cache_dirs: cache_dirs service rc.cachedirs: Started: '/usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -m 11 -M 30 -l off 2>/dev/null'
    

     

    
    root@REAVER:~# time ls /mnt/user/Media/TV/* | wc -l
    2317
    
    real    0m0.528s
    user    0m0.017s
    sys     0m0.098s
    root@REAVER:~# time ls /mnt/user/Media/TV/* | wc -l
    2317
    
    real    0m0.511s
    user    0m0.011s
    sys     0m0.103s
    root@REAVER:~# time ls /mnt/user/Media/TV/* | wc -l
    2317
    
    real    0m0.520s
    user    0m0.012s
    sys     0m0.107s
    
    root@REAVER:~# time ls /mnt/user/Media/TV/* | wc -l
    2317
    
    real    0m0.689s
    user    0m0.012s
    sys     0m0.111s
    

     

    
    root@REAVER:~# time ls /mnt/disk1/Media/TV/* | wc -l
    2317
    
    real    0m0.013s
    user    0m0.009s
    sys     0m0.009s
    

     

    Even just listing the top-level directory, which has far fewer entries, shows a slight difference.

    
    root@REAVER:~# time  ls /mnt/user/Media/TV/ | wc -l
    75
    
    real    0m0.025s
    user    0m0.006s
    sys     0m0.007s
    
    
    root@REAVER:~# time  ls /mnt/disk1/Media/TV/ | wc -l
    75
    
    real    0m0.006s
    user    0m0.007s
    sys     0m0.004s
    

    Link to comment

    I wrote a quick script (benchmark_shfs.sh) to benchmark the difference in performance of running "ls -l" on the disk mount points vs the SHFS "user share" mount points. Don't run it unless you've read it and understand what it does: a couple of variables at the top need to be adjusted for your system, and since there is no error checking and no warnings, there's a risk of data loss if any of the paths it uses collide with existing paths.
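
    The general shape of it is roughly this (a simplified sketch rather than the script itself; the two paths are placeholders and must point at the same directory, once via the disk mount and once via /mnt/user):

    #!/bin/bash
    # Simplified sketch of the benchmark idea, not the actual benchmark_shfs.sh:
    # create N files on a disk mount, then time "ls -l" through the direct disk
    # path and through the shfs user-share path that exposes the same directory.
    DISK_PATH=/mnt/disk3/benchmark   # placeholder: directory via the disk mount
    USER_PATH=/mnt/user/benchmark    # placeholder: same directory via shfs

    bench() {  # print wall-clock seconds for one "ls -l" of the given path
        local t0 t1
        t0=$(date +%s.%N)
        ls -l "$1" > /dev/null
        t1=$(date +%s.%N)
        awk -v a="$t0" -v b="$t1" 'BEGIN { printf "%.2f", b - a }'
    }

    mkdir -p "$DISK_PATH"
    for n in 100000 200000; do
        echo "$n files"
        echo "Writing files"
        seq -f "$DISK_PATH/file%.0f" 1 "$n" | xargs touch
        echo "Benchmarking disk: $(bench "$DISK_PATH")"
        echo "Benchmarking SHFS: $(bench "$USER_PATH")"
    done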

     

    [Chart: Time to list files in Unraid 6.8.0]

     

    Both look fairly linear, but my problem is with the slope of the shfs line. On my system, user shares start to become pretty unusable around 100K-200K files per folder (30s~60s for "ls -l" to enumerate the files). I have no idea how this compares on other hardware or if it was better on Unraid 6.7.2 -- I suspect the performance did decrease somewhat in 6.8.0 since Arq wasn't timing out before I upgraded.

    Link to comment

    I did a quick spot check with 6.8.1 and SHFS performance is about the same as 6.8.0. Based on the changelog, I wasn't expecting any difference.

     

                        100K files     200K files
    6.8.0 Disk|SHFS:    0.46 | 29.34   0.95 | 59.69
    6.8.1 Disk|SHFS:    0.47 | 26.97   0.93 | 57.31
    (all times in seconds)

     

    Edited by Vynce
    Link to comment

    Disabling hard link support seems to be slightly faster, but not substantially.

                             100K files     200K files
    6.8.0      Disk|SHFS:    0.46 | 29.34   0.95 | 59.69
    6.8.1      Disk|SHFS:    0.47 | 26.97   0.93 | 57.31
    6.8.1 NoHL Disk|SHFS:    0.42 | 20.77   0.82 | 45.88
    (all times in seconds; NoHL = hard link support disabled)

     

    Link to comment
    3 hours ago, Vynce said:

    Disabling hard link support seems to be slightly faster, but not substantially.

    Can you add results for 6.7.2 as well?

     

    Using your script this is what I get with hard link support disabled:

    10000 files
    Writing files
    Benchmarking disk: 0.03
    Benchmarking SHFS: 0.15
    
    20000 files
    Writing files
    Benchmarking disk: 0.06
    Benchmarking SHFS: 0.32
    
    30000 files
    Writing files
    Benchmarking disk: 0.08
    Benchmarking SHFS: 0.49
    
    40000 files
    Writing files
    Benchmarking disk: 0.11
    Benchmarking SHFS: 1.05
    
    50000 files
    Writing files
    Benchmarking disk: 0.14
    Benchmarking SHFS: 1.35
    
    60000 files
    Writing files
    Benchmarking disk: 0.16
    Benchmarking SHFS: 1.64
    
    70000 files
    Writing files
    Benchmarking disk: 0.19
    Benchmarking SHFS: 1.93
    
    80000 files
    Writing files
    Benchmarking disk: 0.22
    Benchmarking SHFS: 2.27
    
    90000 files
    Writing files
    Benchmarking disk: 0.25
    Benchmarking SHFS: 2.64
    
    100000 files
    Writing files
    Benchmarking disk: 0.27
    Benchmarking SHFS: 3.02
    
    110000 files
    Writing files
    Benchmarking disk: 0.31
    Benchmarking SHFS: 3.38
    
    120000 files
    Writing files
    Benchmarking disk: 0.34
    Benchmarking SHFS: 3.97
    
    130000 files
    Writing files
    Benchmarking disk: 0.37
    Benchmarking SHFS: 4.56
    
    140000 files
    Writing files
    Benchmarking disk: 0.40
    Benchmarking SHFS: 5.32
    
    150000 files
    Writing files
    Benchmarking disk: 0.42
    Benchmarking SHFS: 5.78
    
    160000 files
    Writing files
    Benchmarking disk: 0.46
    Benchmarking SHFS: 6.54
    
    170000 files
    Writing files
    Benchmarking disk: 0.48
    Benchmarking SHFS: 7.23
    
    180000 files
    Writing files
    Benchmarking disk: 0.51
    Benchmarking SHFS: 8.30
    
    190000 files
    Writing files
    Benchmarking disk: 0.54
    Benchmarking SHFS: 8.83
    
    200000 files
    Writing files
    Benchmarking disk: 0.57
    Benchmarking SHFS: 10.56
    
    210000 files
    Writing files
    Benchmarking disk: 0.60
    Benchmarking SHFS: 10.96
    
    220000 files
    Writing files
    Benchmarking disk: 0.63
    Benchmarking SHFS: 11.64
    
    230000 files
    Writing files
    Benchmarking disk: 0.63
    Benchmarking SHFS: 13.56
    
    240000 files
    Writing files
    Benchmarking disk: 0.66
    Benchmarking SHFS: 14.70
    
    250000 files
    Writing files
    Benchmarking disk: 0.73
    Benchmarking SHFS: 15.26

    Quite a bit different from your results; not sure why.

    Link to comment

    Here are the stats for my system:

                               100K files     200K files
    6.7.2         Disk|SHFS:   0.22 |  3.54   0.48 |  5.59
    6.8.1 HL Off: Disk|SHFS:   0.23 |  4.86   0.46 | 13.11
    6.8.1 HL On:  Disk|SHFS:   0.23 | 15.64   0.51 | 31.47
    (all times in seconds)

    Unraid is running on bare metal.

     

    The share it is writing to is restricted to a single drive, so SHFS didn't have to merge content from multiple places, if that makes a difference.

     

    There is a significant slowdown going from 6.7.2 to 6.8.1 as the number of files increases. When Hard Link support is enabled the slowdown becomes extreme.

    Link to comment
    37 minutes ago, ljm42 said:

    The share it is writing to is restricted to a single drive, so SHFS didn't have to merge content from multiple places, if that makes a difference.

    It should make very little difference in this test.

     

    38 minutes ago, ljm42 said:

    When Hard Link support is enabled the slowdown becomes extreme.

    This can be alleviated fairly easily.

     

    39 minutes ago, ljm42 said:

    There is a significant slowdown going from 6.7.2 to 6.8.1 as the number of files increases.

    This is the big question: how significant is significant, and how big is the number of files? A logical guess is that very few users have directories large enough for any of this to make a difference (that is, the difference between 6.7.2 and 6.8 with hard link support disabled).

     

    For the users where this is an issue, yeah, that's unfortunate. The core of the issue has to do with aging cached attribute information in both the kernel FUSE module and the user-space FUSE node table. That determines how many back-and-forth transactions there are between kernel space and user space for each file system operation; the more transactions, the slower it gets. We could spend many more hours trying to improve this, but in the end, are those hours better spent adding, for example, multiple pools with ZFS support?
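
    For anyone who wants to see that per-file round-trip cost on their own system, something along these lines (assuming strace is installed; the paths are just examples) totals the time "ls -l" spends in syscalls against each mount, and the difference shows up mostly in the stat/getdents rows:

    strace -c ls -l /mnt/disk1/someshare/bigdir > /dev/null
    strace -c ls -l /mnt/user/someshare/bigdir  > /dev/null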

    Link to comment
    6 hours ago, limetech said:

    This is the big question: how significant is significant, and how big is the number of files? A logical guess is that very few users have directories large enough for any of this to make a difference (that is, the difference between 6.7.2 and 6.8 with hard link support disabled).

    I'm guessing users wouldn't purposefully put 200k files in a single directory, but for the OP the issue is the Arq backup software. Using Unraid as a backup destination seems like a great idea, and a user share would be ideal since it can grow larger than one disk. If storing this many files in a single directory is common behavior for backup software, it will probably affect quite a few people.

     

    Not sure what to suggest. Maybe the OP can find a way to set a maximum number of files Arq will put in a given directory? Or maybe split the backups up so that they will fit on individual disks without needing to use a user share? Or maybe there is a comparable backup package that organizes its files in a more compatible way?

     

    Edited by ljm42
    Link to comment

    Thanks to both of you for testing. It's interesting that you're seeing much better performance than I am. Both of you got results in the 10~13s range for 200K files with hardlink support disabled. My result is up at 46s! I do see a lot of variance between runs with this benchmark, but all my results come from running it several times and recording the minimum values. My server isn't under much load, but it's also not a fresh setup with no content. None of the user shares currently span multiple disks.
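
    Concretely, each number I report is the best of several runs, gathered along these lines (the path is the test directory on my system):

    # Best of 5 timed runs of "ls -l"; the path is an example from my test share.
    for i in {1..5}; do
        t0=$(date +%s.%N)
        ls -l /mnt/user/Download/benchmark > /dev/null
        t1=$(date +%s.%N)
        awk -v a="$t0" -v b="$t1" 'BEGIN { printf "%.2f\n", b - a }'
    done | sort -n | head -1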

     

    I don't think it's worth it for the Unraid team to spend a ton of time squeezing every ounce of performance out of SHFS for extremely large directories. But I think it would be worth figuring out why my system is so much slower than your "reference" systems.

     

    I'll try rolling back to 6.7.2 this weekend to see what the performance is like there. I'll also test with docker disabled, etc. Let me know if there are other settings worth playing with to see if they make any difference; I'm also open to suggestions for more precise or relevant benchmarks.

     

    I contacted the author of Arq about this issue and he said the "next version" would use a different file organization strategy (the current version of Arq on macOS is 5.17.2). Arq currently stores all the backup chunks in a single folder, each named by its SHA-1 hash. I suspect he'll split those into subdirectories based on the first character or two of the hash (similar to git). As a side note, Minio just provides an S3-compatible interface to the storage; it doesn't play much of a role in defining the directory structure, which is mostly up to Arq in this case.
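
    Something like this git-style fan-out is what I'd guess he has in mind (purely illustrative, not Arq's actual on-disk format):

    # Bucket each chunk by the first two hex characters of its SHA-1 so that
    # 200K+ chunks spread over 256 subdirectories instead of one.
    chunk="some.chunk"                          # placeholder input file
    hash=$(sha1sum "$chunk" | awk '{print $1}')
    mkdir -p "objects/${hash:0:2}"
    mv "$chunk" "objects/${hash:0:2}/$hash"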

     

    These are some interesting charts. It would be nice if they published the benchmark source.

    Link to comment

    Oh, sorry, I got Minio and Arq confused. I've updated my note to make it clearer for others.

     

    It sounds like the upcoming changes to Arq will probably solve your issues, which is great! Hopefully there aren't a lot of other software packages that need to put 200k files in a single directory.

     

    2 hours ago, Vynce said:

    I think it would be worth figuring out why my system is so much slower than your "reference" systems.

    My system is six years old but still going strong. I have a Xeon E3-1240 v3 processor on an ASRock E3C226D2I mobo with 16 GB RAM. I was testing on a 4TB Seagate NAS drive plugged into an onboard SATA 3 port. My dockers and a VM were running at the time, but it was not under heavy load.

     

    Interestingly, I also tried it with a 12 TB Seagate Ironwolf drive and performance was slightly worse. Nothing really significant, just a little surprising.

     

     

    Nice job on the script BTW, it helped me understand that while there is overhead to the user share system, it takes some pretty extreme values to make it an issue.

    Edited by ljm42
    Link to comment
    9 hours ago, Vynce said:

    I've copied the benchmark script to a gist at @ljm42's suggestion: https://gist.github.com/Vynce/44f224c2846de5fa4cf1d5b1dcad2dc4. Anyone is welcome to hack on it as they like 😊.

    Thanks for your script. Out of curiosity I made a test too on my main server.

     

    I do not experience any noticeable slowdowns when operating the array under 6.8.1; the biggest folder I have has 2000+ items.
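
    In case anyone wants to check their own shares, a quick (if slow) way to see per-directory entry counts is something like this (the path is an example):

    find /mnt/user/Media -type d -exec sh -c 'echo "$(ls -1 "$1" | wc -l) $1"' _ {} \; | sort -rn | head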

    # bash benchmark_shfs.sh
    Unraid 6.8.1
    Hard Link support: no
    
    100000 files
    Writing files
    Benchmarking disk: 0.29
    Benchmarking SHFS: 3.46
    
    200000 files
    Writing files
    Benchmarking disk: 0.59
    Benchmarking SHFS: 9.48
    
    300000 files
    Writing files
    Benchmarking disk: 0.88
    Benchmarking SHFS: 20.75
    # bash benchmark_shfs.sh
    Unraid 6.8.1
    Hard Link support: yes
    
    100000 files
    Writing files
    Benchmarking disk: 0.25
    Benchmarking SHFS: 10.14
    
    200000 files
    Writing files
    Benchmarking disk: 0.50
    Benchmarking SHFS: 22.67
    
    300000 files
    Writing files
    Benchmarking disk: 0.77
    Benchmarking SHFS: 39.73

     

    Link to comment

    I rolled back to 6.7.2 to gather some numbers.

                               100K files     200K files
    6.7.2        Disk|SHFS:    0.43 | 16.27   0.85 | 34.70
    6.8.1 HL Off Disk|SHFS:    0.42 | 18.94   0.82 | 43.03
    6.8.1 HL On  Disk|SHFS:    0.47 | 26.97   0.93 | 57.31
    (all times in seconds)

     

    I noticed something very weird. If I disable docker on 6.7.2 and the server is totally idle, doing an ls -l on a directory with 200K files takes ~51s (so it takes longer than when there's some minimal activity). If I open two terminals and run the same command in both at the same time, they both complete in ~11s. There are no disk reads; this is all cached in RAM. The same thing happens if I kick off two background instances in the same terminal:

    for x in {1..2}; do { time -p /bin/ls -l --color=never /mnt/user/Download/benchmark &>/dev/null; } 2>&1 & done

    The same pattern happens in 6.8.1, but it's a bit slower overall.

     

    I'll also note that backing up to a sparsebundle using Time Machine and Carbon Copy Cloner over SMB became substantially slower in Unraid 6.8.0 and 6.8.1. The sparsebundle format creates 8MB band files in a single directory, so a large backup gets into the range of 50K-100K files in one directory. I'm not sure if that's part of the problem here, but it's another common case where a lot of files end up in a single directory. It might be more closely related to the SMB slowness reported in these two threads. With 6.7.2 I get a fairly consistent transfer rate of 13MB/s to a sparsebundle over SMB to a user share (wired gigabit ethernet). With 6.8.1 (hardlink support disabled) I get very intermittent transfer spikes of 2~12MB followed by nothing for a while; over a long period that averages out to ~0.8MB/s so far 🙁.
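
    For what it's worth, the band size is fixed when the sparsebundle is created, so a larger band size means far fewer files per directory. Recreating the bundle with something like this should do it (macOS; the size and names are examples, and sparse-band-size is given in 512-byte sectors, so 262144 ≈ 128 MB bands):

    hdiutil create -size 600g -type SPARSEBUNDLE -fs HFS+J \
        -imagekey sparse-band-size=262144 -volname Backups Backups.sparsebundle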

    Link to comment

    Following @limetech's suggestion of setting "case sensitive = yes" in SMB Extras has resolved the sparsebundle slowness for me (Time Machine and Carbon Copy Cloner).
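
    For anyone searching later, a per-share version of that setting in SMB Extras looks roughly like this (the share name is a placeholder; it can also go under [global]):

    [backups]
        case sensitive = yes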

     

    Listing large directories through SHFS is still unusually slow on my server. I'm still using a disk share for Minio as a workaround.

    Link to comment
    10 minutes ago, Vynce said:

    Listing large directories through SHFS is still unusually slow on my server. I'm still using a disk share for Minio as a workaround.

    How large and how slow?

    Link to comment
    55 minutes ago, limetech said:

    How large and how slow?

    That's what this thread/bug report is about 🙂.

     

    In 6.8.1 with hardlink support disabled it takes ~43s to list 200K files on my server. Looking through the comments above, for @bonienl the same task only took ~9.5s, for @ljm42 it took ~13s, and for you it took ~10.5s.

    Link to comment



