Slow SMB performance


    ptr727
Priority: Minor




    Recommended Comments



Thank you for the report and blog post. It's going to take a while for me to fully digest what you're doing. There's nothing really 'magic' about the Unraid SMB implementation - we use pretty much the latest releases of Samba. You can tweak it yourself via Network/SMB/SMB extra configuration.

    Link to comment

Please try out Unraid 6.8, currently available as an RC release.

    It addresses the problem that array read performance drops significantly when an array write operation takes place concurrently.

    Link to comment
    4 hours ago, ptr727 said:

    I have now tested Unraid vs. W2K19 VM, and Ubuntu VM, and now Ubuntu bare metal on the same hardware.

    There is no reason why Unraid should be slower on the cache drive, but the ReadWrite and Write performance is abysmal.

    https://blog.insanegenius.com/2020/02/02/unraid-vs-ubuntu-bare-metal-smb-performance/

     

    You should publish the raw data table instead of drawing it in a graph.

    Just using your graph, the estimated write speed is zero (because the Unraid columns are invisible).

     

Let's use 1 MB/s just to have a sensible number.

    My write performance is consistently WAY above 1 MB/s.

    I am fairly certain 1 MB/s performance is a show-stopper for everyone.

     

In fact, with regard to cache access via SMB, I can get 500 MB/s write speed with a simple test: copying a 50GB file from a UD share (NVMe) to cache (NVMe) through SMB on my Windows VM.

     

I did (only recently) notice that 6.8.2 SMB performance is not as good as 6.7.2, but it's only perceptible with NVMe drives (and presumably also with a RAID 0/5/6/10 cache pool).

It is simply not noticeable with SATA-based devices.

     

    So I'm sure there's a bug to fix somewhere but I don't think it's anywhere near the level you are reporting.

    Link to comment

You are welcome to run a test on your own setup for comparison; I describe my test method in the post.

By my testing the Unraid numbers really are bad; I attached my latest set of data.

    DiskSpeedResult_Ubuntu_Cache.xlsx

     

Btw, 500 MB/s is near 4 Gbps; are you running 10 Gbps Ethernet?
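(For context: 500 MB/s × 8 = 4,000 Mb/s = 4 Gb/s, whereas a 1 Gb/s link tops out around 110-120 MB/s.)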

    Edited by ptr727
    Link to comment

    I don't really have much to add other than to mention that I'm also having serious performance regressions post-upgrade.

     

    Link to comment

I ran two tests to check performance. Both the server and the PC have a 10 Gbps connection.

     

1. Copy a 14 GB file from the array to the PC (NVMe)


     

Transfer speeds hover between 240 MB/s and 200 MB/s, which is near the maximum the HDD can sustain.

     

2. Copy the same 14 GB file from the PC (NVMe) to the cache (SSD pool in RAID 10)


     

Transfer speeds hover between 840 MB/s and 760 MB/s, which is near saturation of the 10 Gb/s link.

    Link to comment

Based on the OP's blog, his command is:

    diskspd -w50 -b512K -F2 -r -o8 -W60 -d120 -Srw -Rtext \\storage\testcache\testfile64g.dat > d:\diskspd_unraid_cache.txt

    Translation:

    • diskspd test 
• 50% write + 50% read mixed IO (the OP then ran read-only and write-only variants; see the example commands after this list)
    • block size 512K (the OP ran multiple times with different block sizes)
    • 2 concurrent threads
    • random IO
    • 8 concurrent IO requests per thread (so 16 total)
    • 60 seconds warm-up
    • run test for 120 seconds
• -Srw controls caching and write-through, but I have no clue what exactly r + w do
    • show result in text
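
    For reference, the read-only and write-only variants mentioned above would presumably only change the -w parameter (the output file names below are my assumption, not taken from the blog):

    diskspd -w0 -b512K -F2 -r -o8 -W60 -d120 -Srw -Rtext \\storage\testcache\testfile64g.dat > d:\diskspd_unraid_cache_read.txt
    diskspd -w100 -b512K -F2 -r -o8 -W60 -d120 -Srw -Rtext \\storage\testcache\testfile64g.dat > d:\diskspd_unraid_cache_write.txt

    -w0 means 0% writes (pure random read) and -w100 means 100% writes (pure random write).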

     

    I believe his storage assignments are:

    • "Cache" is on a share with cache = only
    • "Mount" is on array (cache = no).
• W2K19 is on a vdisk on a cache = only share

     

So essentially the OP is stress testing shfs's ability to handle 16 concurrent random IOs.

     

    What he found is shfs doesn't handle random write quite as well as random read.

     

That kinda makes sense to me to some extent, since a read comes directly from a single device while a write requires shfs to first determine which device to write to, adding latency.

    Latency is always more detrimental to random IO than to sequential IO (and the majority of Unraid use cases are probably sequential).
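
    As a rough illustration (the latency numbers here are hypothetical, just to show the shape of the relationship): with a fixed number of outstanding IOs, throughput is roughly outstanding IOs divided by per-IO latency (Little's law). At 16 outstanding 512 KB IOs, 1 ms per IO allows about 16,000 IOPS (~8 GB/s of requests, so the device is the limit), 10 ms per IO caps it at about 1,600 IOPS (~800 MB/s), and 100 ms per IO caps it at about 160 IOPS (~80 MB/s). Sequential workloads hide much of that latency because requests can be merged and streamed.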

    Link to comment

    See:

    https://github.com/ptr727/DiskSpeedTest

    https://github.com/Microsoft/diskspd/wiki/Command-line-and-parameters

-Srw means disable local caching and enable remote write-through (to try to disable remote caching).

     

What I found is that Unraid SMB is much worse at mixed read/write and at write compared to Ubuntu on the exact same hardware, where the expectation is a similar performance profile.

     

Are you speculating that the problem is caused by FUSE?

     

    Edited by ptr727
    Link to comment
    Just now, ptr727 said:

Are you speculating that the problem is caused by SSHFS/FUSE?

You can check if there's a difference by doing the test on a disk share; user shares are known to add some overhead, so some performance degradation is expected.

     

Enable disk shares (Settings -> Global Share Settings), then repeat the test on \\tower\disk1 or \\tower\cache.
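
    For example, adapting the OP's own command to the cache disk share (the output file name is just an example):

    diskspd -w50 -b512K -F2 -r -o8 -W60 -d120 -Srw -Rtext \\tower\cache\testfile64g.dat > d:\diskspd_unraid_cache_diskshare.txt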

     

E.g. this is me doing the same transfer to a user share vs. a disk share:

     

[two screenshots: user share transfer vs. disk share transfer]

     

     

    Link to comment
    24 minutes ago, ptr727 said:

Are you speculating that the problem is caused by FUSE?

Yes indeed. I was about to propose the same thing johnnie just proposed above, i.e. retest on a disk share to bypass shfs.

    Link to comment

So, you are absolutely right: a "disk" share's performance is on par with that of Ubuntu.

     

    Can you tell me more about "shfs"?

As far as I can google, shfs was abandoned in 2004 and replaced by SSHFS, but I don't understand why a remote SSH filesystem would be used here, or are we talking about vanilla libfuse as integrated into the kernel?

    UnraidDisk.png

    DiskSpeedResult_Ubuntu_Cache.xlsx

    Link to comment

After some more googling, I now assume that when you say shfs you are referring to Unraid's FUSE filesystem, which just happens to be similarly named to the better-known shfs, https://wiki.archlinux.org/index.php/Shfs.

     

    A few questions and comments:

- Is Unraid's FUSE filesystem proprietary, or open source, or GPL such that we can request the source?

    - For operations hitting just the cache, with no parity and no spanning, why the big disparity between read and write for what should be a no-op?

    - Logically, cache-only shares should bypass FUSE and go directly to disk, avoiding the performance problem.

    - All appdata usage on cache-only shares will suffer from the same IO write performance problem as observed via SMB, unless users explicitly change the container appdata path from /mnt/user/appdata to /mnt/cache/appdata (see the example below).
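
    For illustration (the container name and paths here are hypothetical, just to show the shape of the change), a container's appdata mapping would need to change from the user share path to the cache path:

    /config -> /mnt/user/appdata/plex   (goes through the FUSE layer)
    /config -> /mnt/cache/appdata/plex  (direct to the cache device)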

     

    Link to comment

    'shfs' is our proprietary union-type FUSE file system implemented entirely in-house.

    Every access via /mnt/user/.. is via this file system, with some exceptions:

• if one of the loopback file systems (libvirt, docker) exists on /mnt/user, we actually de-reference the file (find out which volume it exists on) and use that path when mounting the loopback.
    • similarly, vdisk images existing on /mnt/user/.. are also de-referenced when starting a VM.

    The idea of bypassing shfs when all files of a share are guaranteed to exist on the same volume is on our to-do list and should get incorporated when we release multiple-pool support.  You can use a bind mount to set this up yourself if you want.
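
    A minimal sketch of that bind-mount workaround (the share name 'appdata' is only an illustration, and this assumes every file of the share really does live on the cache volume):

    mount --bind /mnt/cache/appdata /mnt/user/appdata
    # accesses to /mnt/user/appdata now go straight to the cache device, bypassing shfs

    umount /mnt/user/appdata
    # removes the bind mount again (do this before stopping the array)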

     

    We're looking into the recent SMB slowdown this week to see what changed.

    Link to comment

    Thank you for the info.

     

Would it then be accurate to say the read/write and write performance problem shown in the ongoing SMB test results is caused by shfs?

     

Can you comment on why the write performance is so massively impacted compared to read, especially since the target is the cache, which needs no parity computation on write, i.e. it can read through and write through?

    Link to comment
    3 minutes ago, ptr727 said:

    Can you comment on why the write performance is so massively impacted compared to read

    Shouldn't be.

    Link to comment

Thanks very much for the explanation. I didn't realise unRAID was smart enough to de-reference files like that. I actually just changed all my vdisk paths to /mnt/cache rather than /mnt/user because of some reports of FUSE slowing down certain things.

     

    So to be clear: When creating vdisks in a cache-only directory, we can use /mnt/user, and the vdisks will be mounted under /mnt/cache?

    Link to comment
    1 hour ago, -Daedalus said:

    So to be clear: When creating vdisks in a cache-only directory, we can use /mnt/user, and the vdisks will be mounted under /mnt/cache?

Correct. The actual disk path of the vdisk is passed to qemu.
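
    One way to verify this yourself (my suggestion; the exact argument layout depends on the libvirt/QEMU version) is to look at the arguments of the running QEMU process after starting the VM; the de-referenced path should show up there instead of the /mnt/user path:

    ps aux | grep [q]emu
    # the -drive/-blockdev arguments should reference /mnt/cache/... (or the relevant disk) rather than /mnt/user/...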

    Link to comment

    Perfect!

     

If needed I can raise a specific feature request for it, but I think it would be useful to have this in the help text under vdisk when creating a VM.

    Link to comment

Please add this line to the "Samba extra configuration", see Settings -> SMB -> SMB Extras:

    case sensitive = true

    And let me know if this makes any difference.
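
    You can confirm Samba picked the setting up by running testparm from a console, e.g.:

    testparm -s 2>/dev/null | grep -i "case sensitive"
    # should print something like: case sensitive = Yes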

    Link to comment

Ok, but why would an SMB option make a difference if it looks as if it is a "shfs" write problem, i.e. SMB over a disk share performed well, SMB over a user share performed badly, and read performance was always good?

     

    I'll give it a try (case sensitive SMB will break Windows), but I won't be able to test until next week.

     

I believe it should be easy to reproduce the results using the tool I've written, so I would suggest you profile the code yourself, rather than wait for my feedback on the experiments.

    Link to comment




