• Slow SMB performance


    ptr727
    • Minor






    54 minutes ago, ptr727 said:

    Ok, but why would a SMB option make a difference

    We fixed this other bug first, and since SMB is path-based, if the test program is generating a lot of lookups during writes for some reason, that could explain it.

     

    56 minutes ago, ptr727 said:

    I'll give it a try (case sensitive SMB will break Windows),

    Right, I'd be interested in hearing about a specific case, most things work fine.  However, I realize there could be an app that breaks.  Adding the line to SMB Extras is a quick test/fix, ultimately we'll make a per-share config setting for this.

     

    58 minutes ago, ptr727 said:

    I believe it should be easy to reproduce the results using the tool I've written, so I would suggest you profile the code yourself, rather than wait for my feedback to the experiments.

    Which tool?  Please provide link.  Also debugging these kinds of issues is a 2-way street.  This is not the only issue we have to work on.

    Link to comment
    2 minutes ago, limetech said:

    Which tool?  Please provide link.  Also debugging these kinds of issues is a 2-way street.  This is not the only issue we have to work on.

    I have spent a significant amount of time and effort chasing the SMB performance problem (I immediately noticed the slowdown when I switched from W2K16 to Unraid), so I do think my side of the street has been well worn.

    I referenced the tool I wrote to automate the tests in the last three of my blog posts, where I detail my troubleshooting, and I posted every one of them in this thread.

    For completeness, here again: https://github.com/ptr727/DiskSpeedTest

    Link to comment
    5 minutes ago, ptr727 said:

    For completeness, here again

    Thanks for the link.  I don't have a Windows code development environment, and as soon as I find the half a day to learn this and get it running, I'll take a stab.  In the meantime I'll do some testing with Linux CIFS and see if I can reproduce similar results.

    Link to comment

    I tried with DirectIO yes, and DirectIO yes plus case insensitive yes, no difference (see attached results).

    Given that a disk share over SMB showed good performance, I am sceptical that it is an SMB issue; my money is on a performance problem in the shfs write path.

    DiskSpeedResult_Ubuntu_Cache.xlsx

    Link to comment

    Hi guys, unfortunately I see the same issue with SMB. Here are my results comparing the Performance and On Demand CPU profiles. P-states are disabled in the config, with my CPU running at max frequency even when idle.


    [attached image]
     

    Here are the write speeds once again:
    [attached image]

     

     

    Currently, this prevents me from switching to unRAID in production. Shouldn't this be a high-priority issue?

     

    My Specs:

    10GbE

    i9-9900

    NVME Cache

    intel_pstate=disable

     

    Link to comment
    On 4/1/2020 at 4:02 PM, rhard said:

    Hi guys, unfortunately I see the same issue with SMB. Here are my results comparing the Performance and On Demand CPU profiles. P-states are disabled in the config, with my CPU running at max frequency even when idle.

    Which version are you using? I saw a significant performance drop starting in 6.8.X, with only partial recovery of performance by modifying Tunable Direct IO and SMB case settings. 6.6.7 at least should be quite a bit faster and doesn't suffer from the multistream read/write issues in 6.7.X.

    Edited by golli53
    trimmed quote
    Link to comment

    I just started with unRAID. I am on the latest stable which is 6.8.3. Today I will test SMB on cache disk bypassing unRAID file system. 

    Link to comment

    Here are the results comparing User Share vs DiskShare + Direct IO:
    [attached image]

     

    Write speeds:

    [attached image]

     

    I have no idea what it means...

    Link to comment

    I never used a version lower than 6.8.3 so I'm not able to compare, but the speed through SMB is super slow compared to NFS or FTP:

     

    @bonienl

    You ran two tests, and in the first one you were able to download from one HDD at 205 MB/s. Wow, I have never reached more than 110 MB/s through SMB since enabling the parity disk! Do you have one? Are you sure you used the HDDs? A re-downloaded file comes from the SMB cache (RAM), but then 205 MB/s would be really slow (re-downloading a file from my Unraid server hits 700 MB/s through SMB).

     

    In your second test you reached 760 MB/s on your RAID10 SSD pool and you think this value is good? With your setup you should easily reach more than 1 GB/s!

     

    With my old Synology NAS I downloaded at 1 GB/s without problems (depending on the physical location of the data on the HDD platter), especially if the file was cached in RAM. This review shows the performance of my old NAS. And it does not use SSDs at all!

     

    I tested my SSD cache (a single NVMe) on my Unraid server and it's really slow (compared to the 10G setup and the constant SSD performance):

     

    FTP Download:

    [attached image]

     

    FTP Upload:

    [attached image]

     

    A 1TB 970 Evo should easily hit the 10G limits for up- and downloads.

     

    I think there is something really wrong with Unraid. And SMB is even worse.

    Link to comment

    For the time being I gave up on Unraid fixing this, I moved to Proxmox with ZFS: Removed link at the request of @limetech

     

    Edited by ptr727
    Link to comment
    11 hours ago, ptr727 said:

    For the time being I gave up on Unraid fixing this, I moved to Proxmox with ZFS: Removed link at the request of @limetech

     

     

     

    Unfortunately for this and other reasons I also determined that Unraid just wasn't suited for my needs.  I moved to Ubuntu with ZFS and use Looking Glass to simplify my GPU-passthru VM setup.  Although it's far from perfect, it does everything I need.

    Link to comment

    I am also seeing these issues with SMB being stupidly slow with small files.

     

    A weekly backup process that used to take me ~1 hour for the past many years took me an entire day of baby sitting.

     

    I tested with NFS and speeds are as expected, showing that all hardware / network / filesystem are OK and the issue is with Samba.

     

    If I do the same test to a windows SMB share, speeds are as expected.

     

    I am trying to get FTP working now, and will test that next. If I can't get this performance working better not sure what I will do, I really like unraid. Maybe a docker can share via smb properly?

     

    A VM will share properly via smb but a VM can't access a local share on unraid.

    Link to comment
    1 hour ago, TexasUnraid said:

    I am also seeing these issues with SMB being stupidly slow with small files.

     

    A weekly backup process that used to take me ~1 hour for the past many years took me an entire day of baby sitting.

    Slow transfer of small files over SMB is expected behavior if there are a lot of small files, whether that be Microsoft SMB or *nix and Samba.  But if you're seeing a large disparity between a Windows share and an Unraid share when copying the same files, that definitely indicates a problem.  Though I'd be curious what the disparity is between the two if you copy a single large file for testing.

     

    In the past I had seen similar slowness using Samba on *nix machines and the usual fix was to modify the Samba config to enable the socket option "TCP_NODELAY".  I have not tried this myself on Unraid as I have not seen slow throughput over SMB, though I typically do large image backups and don't copy a large number of small files.  I see close to a full gigabit (the client NIC speed) across my network when doing image backups using Macrium Reflect to an SMB share I have on my Unraid server.

     

    https://www.samba.org/~ab/output/htmldocs/Samba3-HOWTO/speed.html

    Link to comment

    Yeah, I am used to the slowdown with small files on windows but unraid takes it to another level.

     

    I have a test folder with ~8k small files, mostly between 10 KB and 100 KB, that I have been messing with.

     

    Copying it to a windows machine takes ~40 seconds

     

    Copying it to the Unraid RAID0 cache takes over 6 minutes.

     

    Multiply that by 100x and you can see how it would become a real problem.
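A rough way to reproduce this comparison is to script it. The sketch below is a minimal Python example, not the DiskSpeedTest tool mentioned earlier in the thread; the function names, file naming scheme, and paths are all hypothetical. It creates a folder of small files in the 10-100 KB range and times a file-by-file copy to a target folder, which in a real run would be a mapped Unraid share or a Windows share.

```python
import os
import random
import shutil
import tempfile
import time
from pathlib import Path


def make_small_files(folder: Path, count: int, min_kb: int = 10,
                     max_kb: int = 100, seed: int = 0) -> int:
    """Create `count` files of random size in [min_kb, max_kb] KiB.

    Returns the total number of bytes written.
    """
    rng = random.Random(seed)
    folder.mkdir(parents=True, exist_ok=True)
    total = 0
    for i in range(count):
        size = rng.randint(min_kb, max_kb) * 1024
        (folder / f"file_{i:05d}.bin").write_bytes(os.urandom(size))
        total += size
    return total


def timed_copy(src: Path, dst: Path) -> tuple[int, float]:
    """Copy every file in `src` to `dst`; return (files copied, elapsed seconds)."""
    dst.mkdir(parents=True, exist_ok=True)
    start = time.perf_counter()
    copied = 0
    for entry in sorted(src.iterdir()):
        if entry.is_file():
            shutil.copyfile(entry, dst / entry.name)
            copied += 1
    return copied, time.perf_counter() - start


if __name__ == "__main__":
    # Demo against a local temp folder only. For a real test, replace
    # `target` with the path of a mapped share (assumption: the share is
    # already mounted or mapped on the client).
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source"
        target = Path(tmp) / "target"
        make_small_files(source, count=200)
        copied, seconds = timed_copy(source, target)
        print(f"{copied} files in {seconds:.2f} s ({copied / seconds:.0f} files/s)")
```

For reference, the figures quoted above work out to roughly 8,000 / 40 = 200 files/s to the Windows machine versus at most about 8,000 / 360 ≈ 22 files/s to Unraid, a gap of about 9x.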

     

    iperf / large files easily max out a gigabit connection to both unraid and windows. My 10gb P2P connection is also able to manage ~600-700mb/s

     

    I tried the TCP_NODELAY option at one point and did not notice any big changes but maybe I needed to restart for it to take effect?

     

    The fact that NFS works fine suggests the issue has something to do with Samba, though.

     

    The best I have come up with at this point is using a script to enable NFS during automated backups and then disable it again since it is a gaping security hole. Not a good option but I can't think of another option.

    Edited by TexasUnraid
    Link to comment

    Yeah, that's a pretty big disparity.  Are you using a *nix OS on the client, or is it a Windows machine?  What kind of NIC is in your server?  I'd be curious to see what a packet capture shows if you reproduce the slowness.  I will try to reproduce some results in my lab to see if I can recreate the issue as well.

     

    Just food for thought, I'd be sure your Samba config isn't forcing use of SMB1.  Years ago we had an issue at work where users would complain of super long logon times in the legacy Server 2003 farm we were migrating off of.  Turned out that a bunch of crap had been roamed in their profiles; thousands of tiny files.  Because Server 2003 only supported SMB1, and even though the SAN was very capable, the servers could only copy the files sequentially and the result was instead of a 30 second logon time, users were seeing hours long logon times in some cases.

    Link to comment

    Clients are all windows, I tried to move to linux years ago but just can't make it work for day to day use, too much use of the terminal for daily use from someone that doesn't speak linux lol.

     

    I am running an Intel T340 NIC in the server and an Intel onboard NIC in most of the clients as well.

     

    I disabled SMB1 actually so it is not that.

     

    I have no idea how to do packet logging on Linux. Besides that time I tried to switch to Linux, I have only really used Linux in VMs before now.

     

    I agree, I would like to see what is happening as well.

    Link to comment

    If you install the Nerd Pack plugin you can install tcpdump.

    [attached image]

     

    From there you can collect a packet capture at the server by going to the command line and running something like 'tcpdump -w /mnt/user/pub/tcpdump.cap -i br0'.  You will need to change the path to whatever folder you want it in, pub just happens to be a share I have setup.  You may also need to change your interface name, though it should be br0.  Hit enter and it will start the capture, then start copying the files to reproduce the slowness.  After maybe 60 seconds or so you can hit CTRL-C to stop the capture.  You can open the file in Wireshark, or, if you're comfortable, you can zip it up and share it here for review.
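The capture step above can also be scripted so that it stops on its own. The following is only a sketch: the helper names are made up, and the `port 445` capture filter is an addition that was not part of the original command (it assumes the SMB traffic uses the standard TCP port 445, and it keeps unrelated traffic out of the file). It needs Python and tcpdump on the machine doing the capture, plus sufficient privileges to capture on the interface.

```python
import shlex
import signal
import subprocess


def tcpdump_command(interface: str = "br0",
                    output: str = "/mnt/user/pub/tcpdump.cap",
                    smb_only: bool = True) -> list[str]:
    """Build the tcpdump argument list; optionally restrict it to TCP/UDP port 445."""
    cmd = ["tcpdump", "-i", interface, "-w", output]
    if smb_only:
        cmd += ["port", "445"]  # assumption: SMB runs on its default port
    return cmd


def capture(seconds: float = 60.0, **kwargs) -> None:
    """Run tcpdump for roughly `seconds`, then interrupt it as CTRL-C would."""
    proc = subprocess.Popen(tcpdump_command(**kwargs))
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGINT)  # lets tcpdump flush and close the file
        proc.wait()


if __name__ == "__main__":
    # Only print the command here; call capture() to actually run it.
    print(shlex.join(tcpdump_command()))
    # prints: tcpdump -i br0 -w /mnt/user/pub/tcpdump.cap port 445
```

Start `capture()` first, then begin the file copy on the client so the slow transfer falls inside the capture window.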

     

    Alternatively you can do all of this from the client side using Wireshark as well, though, it may require a packet capture from both to more effectively see what is happening.  I'd personally start with the capture at the server as I think this will provide the most valuable data.

     

    This article looks like it has some pretty good info regarding using Wireshark to capture SMB.
    https://thebackroomtech.com/2019/05/22/using-wireshark-to-sniff-an-smb-transmission/

    Link to comment

    Here is a capture of the dummy files going to the server, this is 100 small files with a total size of 2mb, I stopped it there to prevent the capture file from getting too large.

     

    I used my 10gb P2P connection for this capture so that it would only have the file copy packets and I would feel ok uploading it, performance is the same or worse on the 1 gig connections though.

     

    If I do that same copy to a Windows machine, it goes at least 10x as fast.

     

    Edited by TexasUnraid
    Link to comment

    Can you re-run the capture but this time write the .cap file to a temp folder (i.e. /root) that isn't a share and reproduce the issue?  Then copy the file off after you stop the capture.  This has a lot of extraneous stuff in it that I can't filter very easily.  I'll take a closer look at it later this evening.

    Link to comment
    21 minutes ago, _whatever said:

    Can you re-run the capture but this time write the .cap file to a temp folder (i.e. /root) that isn't a share and reproduce the issue?  Then copy the file off after you stop the capture.  This has a lot of extraneous stuff in it that I can't filter very easily.  I'll take a closer look at it later this evening.

    Ok, here is a new capture. I started the capture with the copy already in progress to hopefully eliminate extraneous stuff before and after the transfer.

     

    I have noticed something interesting though, the first ~300 files copy much faster (although still far behind windows speed) and it then slows to a crawl.

     

    It also gets worse the more I do. For example, when re-running this test, it started off slow, then went down to really slow and eventually dropped to stupid slow. It stays that way as long as I keep working with files, but if it sits for a while and I come back to it, it will be faster for the first ~1000-2000 files before falling off a cliff again.

     

    Like some kind of buffer or limit is being reached.
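One way to put numbers on this "fast at first, then a cliff" pattern is to record when each file finishes copying and report the rate per batch. The sketch below is an illustration under assumptions, not a tool used in this thread: the function names and the batch size of 100 are arbitrary, and the source and destination paths are supplied by whoever runs it.

```python
import shutil
import sys
import time
from pathlib import Path


def copy_with_timestamps(src: Path, dst: Path) -> list[float]:
    """Copy files one by one; return the seconds since start at which each finished."""
    dst.mkdir(parents=True, exist_ok=True)
    start = time.perf_counter()
    finished = []
    for entry in sorted(src.iterdir()):
        if entry.is_file():
            shutil.copyfile(entry, dst / entry.name)
            finished.append(time.perf_counter() - start)
    return finished


def batch_rates(finished: list[float], batch: int = 100) -> list[float]:
    """Files per second for each consecutive full batch of `batch` files."""
    rates = []
    previous = 0.0
    for end in range(batch, len(finished) + 1, batch):
        elapsed = finished[end - 1] - previous
        rates.append(batch / elapsed if elapsed > 0 else float("inf"))
        previous = finished[end - 1]
    return rates


if __name__ == "__main__":
    # Usage (both paths are placeholders):
    #   python batch_rates.py <source_folder> <folder_on_share>
    if len(sys.argv) == 3:
        times = copy_with_timestamps(Path(sys.argv[1]), Path(sys.argv[2]))
        for index, rate in enumerate(batch_rates(times)):
            first = index * 100
            print(f"files {first:>5}-{first + 99:>5}: {rate:8.1f} files/s")
```

If a limit really is being hit, the per-batch rate should drop sharply after the first few batches rather than degrade evenly.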

     

    Thanks for the help!

    tcpdump.cap

    Edited by TexasUnraid
    Link to comment

    I see a lot of "STATUS_OBJECT_NAME_NOT_FOUND" in response to trying to create files, and the client then sending successive create requests.  I'm not extremely well versed in Wireshark or SMB, but, this definitely looks odd.
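To quantify how often this status appears instead of eyeballing it in Wireshark, the capture can be filtered with tshark. This is a minimal sketch under assumptions: tshark is installed on the machine analysing the capture, the traffic is SMB2/3 (so the `smb2.nt_status` display-filter field applies), and the helper names are invented for illustration. `0xC0000034` is the NT status value for STATUS_OBJECT_NAME_NOT_FOUND.

```python
import subprocess

STATUS_OBJECT_NAME_NOT_FOUND = 0xC0000034  # NT status code


def tshark_command(capture_file: str,
                   nt_status: int = STATUS_OBJECT_NAME_NOT_FOUND) -> list[str]:
    """Build a tshark call that lists the frame numbers carrying `nt_status`."""
    display_filter = f"smb2.nt_status == 0x{nt_status:08x}"
    return ["tshark", "-r", capture_file, "-Y", display_filter,
            "-T", "fields", "-e", "frame.number"]


def count_matches(capture_file: str) -> int:
    """Run tshark and count the matching frames (one frame number per output line)."""
    result = subprocess.run(tshark_command(capture_file),
                            capture_output=True, text=True, check=True)
    return sum(1 for line in result.stdout.splitlines() if line.strip())


if __name__ == "__main__":
    print(" ".join(tshark_command("tcpdump.cap")))
```

A count on its own does not show whether these responses are abnormal; comparing it against the same copy captured to a Windows share would give a baseline.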

     

    Can you share your Samba config?  I'm curious to see what /etc/samba/smb.conf and /boot/config/smb-extra.conf have in them on your server.

    Link to comment

    SMB Extras (I have tried disabling the recycle bin, with the same results):

    #vfs_recycle_start
    #Recycle bin configuration
    [global]
       syslog only = No
       log level = 0 vfs:0
    #vfs_recycle_end
    
    #unassigned_devices_start
    #Unassigned devices share includes
       include = /tmp/unassigned.devices/smb-settings.conf
    #unassigned_devices_end

    smb.conf (this should be stock; I have not changed much in the way of custom settings in Unraid):

     

    [global]
            # configurable identification
            include = /etc/samba/smb-names.conf

            # log stuff only to syslog
            log level = 0
            syslog = 0
            syslog only = Yes

            # we don't do printers
            show add printer wizard = No
            disable spoolss = Yes
            load printers = No
            printing = bsd
            printcap name = /dev/null

            # misc.
            invalid users = root
            unix extensions = No
            wide links = Yes
            use sendfile = Yes
            aio read size = 0
            aio write size = 4096
            allocation roundup size = 4096

            # ease upgrades from Samba 3.6
            acl allow execute always = Yes
            # permit NTLMv1 authentication
            ntlm auth = Yes

            # hook for user-defined samba config
            include = /boot/config/smb-extra.conf

            # auto-configured shares
            include = /etc/samba/smb-shares.conf

     

    Link to comment




