• Unraid OS version 6.9.0-beta30 available


    limetech

    Changes vs. 6.9.0-beta29 include:

     

    Added workaround for mpt3sas not recognizing devices with certain LSI chipsets. We created this file:

    /etc/modprobe.d/mpt3sas-workaround.conf

    which contains this line:

    options mpt3sas max_queue_depth=10000

    When the mpt3sas module is loaded at boot, that option will be specified.  If you added "mpt3sas.max_queue_depth=10000" to the syslinux kernel append line, you can remove it.  Likewise, if you manually load the module via the 'go' file, you can remove that as well.  When/if the mpt3sas maintainer fixes the core issue in the driver, we'll remove this workaround.
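
    After booting the new release, a quick way to confirm the option took effect (a sketch; it assumes the module exposes max_queue_depth as a readable parameter in sysfs, which mpt3sas normally does):

        # should report 10000 once the module loads with the new option
        cat /sys/module/mpt3sas/parameters/max_queue_depth

        # the workaround file shipped with this release
        cat /etc/modprobe.d/mpt3sas-workaround.conf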

     

    Reverted libvirt to v6.5.0 in order to restore storage device passthrough to VMs.

     

    A handful of other bug fixes, including 'unblacklisting' the ast driver (Aspeed GPU driver).  For those using those on-board graphics chips, primarily on Supermicro boards, this should increase the speed and resolution of the local console webGUI. 
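
    For those boards, a quick way to confirm the ast driver is now loading after the upgrade (a sketch, not an official check):

        # module should appear once the blacklist entry is gone
        lsmod | grep '^ast'

        # confirm no modprobe.d file still blacklists it
        grep -r "blacklist ast" /etc/modprobe.d/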

     


     

    Version 6.9.0-beta30 2020-10-05 (vs -beta29)

    Base distro:

    • libvirt: version 6.5.0 [revert from version 6.6.0]
    • php: version 7.4.11 (CVE-2020-7070, CVE-2020-7069)

    Linux kernel:

    • version 5.8.13
    • ast: removed blacklisting from /etc/modprobe.d
    • mpt3sas: added /etc/modprobe.d/mpt3sas-workaround.conf to set "max_queue_depth=10000"

    Management:

    • at: suppress session open/close syslog messages
    • emhttpd: correct 'Erase' logic for unRAID array devices
    • emhttpd: wipefs encrypted device removed from multi-device pool
    • emhttpd: yet another btrfs 'free/used' calculation method
    • webGUI: Update statuscheck
    • webGUI: Fix dockerupdate.php warnings

     




    User Feedback

    Recommended Comments



    Is the shfs direct_io option still relevant? I don't see it in the log when enabled, and it appears to make no difference on or off.

    3 hours ago, JorgeB said:

    Is the shfs direct_io option still relevant? I don't see it in the log when enabled, and it appears to make no difference on or off.

    It's passed differently in FUSE3, no longer as a mount option, but shfs will set that config bit when init happens as a result of mount.  We added that option back when we were using FUSE2 because it helped performance with servers that had 10Gbit Ethernet.  FUSE3 added several I/O improvements, among them eliminating the kernel->user->kernel data copies, so it doesn't surprise me if direct_io on/off makes no difference.

    On 10/11/2020 at 5:10 AM, JorgeB said:

    Did some Samba aio enable/disable tests,

    Thank you very much for this set of numbers.  Here are my observations:

    • Probably samba aio is not enabled at all even if 'aio read size' or 'aio write size' is non-zero.  It's very possible the differences you are seeing are just "noise".  According to 'man smb.conf' there are a number of preconditions which must exist for it to be active, so probably something is not met?
    • Not surprised by the dismal shfs performance with 25K small files.  The union layer introduces a lot of overhead to ensure consistent metadata.  Maybe some checks can be relaxed - I'll look into that.

    For the next release I put this into /etc/samba/smb.conf:

            # disable aio by default
            aio read size = 0
            aio write size = 0

    As you observed, it's easy to re-enable via the config/smb-extra.conf file.
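
    For anyone who wants aio back on, a sketch of what that override might look like in config/smb-extra.conf on the flash device (the [global] header and the value of 1, Samba's usual "enabled" default, are assumptions here, not an official recommendation):

            [global]
            # re-enable Samba async I/O, overriding the shipped default of 0
            aio read size = 1
            aio write size = 1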

    6 minutes ago, limetech said:

    It's very possible the differences you are seeing are just "noise".  According to 'man smb.conf' there are a number of preconditions which must exist for it to be active, so probably something is not met?

    Maybe, but I can reproduce the btrfs issue as many times as I want (with -beta1 or earlier) just by changing those values, so that suggests it's working, no? Or at least that it's changing something...

     

     

    4 minutes ago, JorgeB said:

    Maybe, but I can reproduce the btrfs issue as many times as I want (with -beta1 or earlier) just by changing those values, so that suggests it's working, no? Or at least that it's changing something...

     

     

    True.  Did you see the same btrfs issue with 6.8.3? The only difference between those releases was updating the Linux kernel (to 5.5.8).  Quite a bit has changed between then and now.

     

    Thinking about what Samba 'aio' does, there's probably no benefit for us at all.  For reads it would probably only help if multi-channel were enabled, which we're not doing yet, or if multiple clients were trying to saturate-read the same server, which is probably very uncommon in typical Unraid OS environments.  For aio writes, as the Samba docs explain, the usefulness is also limited since the Linux buffer cache is already "asynchronous".
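
    For checking which aio values Samba actually ends up running with on a given release, something like this works (a sketch using the stock testparm tool):

        # effective settings, including defaults
        testparm -sv 2>/dev/null | grep -E 'aio (read|write) size'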

    9 minutes ago, limetech said:

    Not surprised by the dismal shfs performance with 25K small files.  The union layer introduces a lot of overhead to ensure consistent metadata.  Maybe some checks can be relaxed - I'll look into that.

    Not really something that affects me, and I would guess most users, since most archived content is probably medium/large media files, but anything you could do would always be good. According to some quick tests I did this morning comparing user vs disk shares with small file writes, it appears to be getting worse with each new release: writing to a disk share was about 170% faster than to a user share with v6.7, 350% faster with v6.8, and is now about 500% faster with -beta30. The write speed to a disk share has remained about the same; it's the user share writes that keep getting slower.

     

    6 minutes ago, limetech said:

    Did you see the same btrfs issue with 6.8.3?

    Yep, and with v6.9-beta1; it only doesn't happen with the recent betas.

    7 minutes ago, JorgeB said:

    Not really something that affects me, and I would guess most users, since most archived content is probably medium/large media files, but anything you could do would always be good. According to some quick tests I did this morning comparing user vs disk shares with small file writes, it appears to be getting worse with each new release: writing to a disk share was about 170% faster than to a user share with v6.7, 350% faster with v6.8, and is now about 500% faster with -beta30. The write speed to a disk share has remained about the same; it's the user share writes that keep getting slower.

     

    Just to verify: in your testing, "writing" 25K small files means "creating" and then "writing" 25K small files.  And "reading" 25K small files means reading the file contents back from the 25K small files just created - correct?  And then in-between the write pass and the read pass you did something to flush server buffer cache (such as rebooting) - correct?

    42 minutes ago, limetech said:

    Just to verify: in your testing, "writing" 25K small files means "creating" and then "writing" 25K small files.  And "reading" 25K small files means reading the file contents back from the 25K small files just created - correct?  And then in-between the write pass and the read pass you did something to flush server buffer cache (such as rebooting) - correct?

    Correct. Just stopping and restarting the array appears to be enough to flush the cache; at least the results were consistent across repeated runs, some of them after rebooting when I changed the release used.
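
    For anyone repeating these tests, the server's page cache can also be dropped explicitly between the write and read passes (run as root; a generic Linux sketch, not something used in the tests above):

        sync                                  # flush dirty pages first
        echo 3 > /proc/sys/vm/drop_caches     # drop page cache, dentries and inodes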

    43 minutes ago, limetech said:

    Also are all the 25K small files in a single directory?

    Multiple directories. These last tests this morning used just part of the small files to make them go faster, specifically the smaller ones, which I noticed were causing the more significant slowdown: 11.5k files in 74 folders totaling about 50MB, so very small files, and not something most people would ever use with Unraid. I was just curious after the other results. I also tested with direct_io on/off and there was no difference in almost all the tests, hence my earlier question.

     

    On the other hand, user shares performed very well with large files, always about the same as using disk shares, on all 3 releases I tested (v6.7.2, v6.8.3 and -beta30).
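
    For anyone who wants to reproduce a similar small-file workload, a rough sketch; the folder and file counts below only approximate the numbers described above, and the share name 'test' is hypothetical:

        # create 74 folders with ~150 files of ~4KB each (~11k files, ~45MB total)
        mkdir -p /mnt/user/test/smallfiles
        for d in $(seq 1 74); do
            mkdir -p /mnt/user/test/smallfiles/dir$d
            for f in $(seq 1 150); do
                head -c 4096 /dev/urandom > /mnt/user/test/smallfiles/dir$d/file$f
            done
        done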


    Overall I'm quite pleased with this beta, but the VNC output seems very grainy compared to what I recall from previous versions for Linux and Windows VMs. Is this something that has an easy fix? I have tried different resolutions and it doesn't seem to matter. I am selecting QXL as the video driver. I'm also running a VM on a host with a single Nvidia GPU and passing it through (in case that is causing the issue).

    Edited by someguy434

    I've just discovered that I can't spin up a new Ubuntu VM, not only on my Threadripper system but also on my Xeon system.  Anyone else seeing this?  I'm finding this beta quite buggy: hosts needing to be rebooted, GPUs not passing through properly, and VM editing causing issues.  Docker seems OK.  I thought it was an AMD thing before.

    skywalker-diagnostics-20201014-1438.zip

    4 hours ago, Marshalleq said:

    VM editing causing issues

    I found that changing a VM drops parts of the network config; the VM starts but cannot be accessed except via VNC. I saw the same issue with a new VM.  

     

     

    Does the VM fail to start?

    5 hours ago, Marshalleq said:

    I can't spin up a new Ubuntu VM

    Just installed one recently with -beta30 without issues, Ubuntu 20.04.1 IIRC, default settings.


    Thanks, so it's not specific to Ubuntu and not specific to CPU brand.  Both my systems are SMP; I assume that's not going to be an issue though.


    I’ve been ruminating on this SAMBA aio issue because the very large read performance difference first reported by @trypowercyclereminded me of an issue I’ve seen before, but I was having trouble finding that post, now I know why, because those forums are gone? I did finally find it in my content:

     

    [attached screenshot: the original post]

     

    And this is the comparison I posted at the time:

    [attached image: beta21 vs rc4 XFS speed comparison]

     

    So I believe I noticed this issue at around the same time aio was introduced in Samba, and at the time disabling smb3 fixed it. Now I wonder if it was the same issue all along and disabling smb3 was also disabling aio, since the symptoms are very similar. The problem wasn't controller related but device related (some brands/models perform worse than others), so I have now done some more tests with -beta30 and different disks.

     

    Ignore the normal max speed difference from brand to brand; I used whatever disks I had at hand, so some are older and slower than others. The important part is the aio on/off difference. I tested with disk shares so there's no shfs interference, with all disks connected to the same Intel SATA controller. Each test was repeated 3 times to make sure the results are consistent, with the read speed reported by robocopy after transferring the same large file.
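
    As a cross-check on numbers like these, the raw uncached read speed of each disk can also be measured on the server itself, taking Samba and aio out of the path entirely (a sketch; the file path is hypothetical):

        # direct (uncached) sequential read of the same large file, per disk
        dd if=/mnt/disk1/test/largefile.bin of=/dev/null bs=1M iflag=direct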

     

    [attached image: per-disk read speed results with aio on vs. off]

     

     

    I think the results are very clear, and by luck (or bad luck) the tests this past weekend were done with the only disk that doesn't show a significant difference. Note that I don't think this is a disk brand issue but a disk model issue, likely firmware related, and possibly worse on older disks?

     

    I know you already plan to leave aio disabled, but here's one more data point that I believe really confirms it should be left disabled.

     


    Has anyone else also noticed slower performance from cache pools since the partition layout changed in beta 29?  

     

    This is really noticeable for me with applications in Docker containers: Plex loading thumbnails in the web interface, Tautulli loading history, and Sonarr v3 showing its witty quotes while it loads are all much slower since repartitioning with beta 29. Sonarr takes about 15 seconds to load now when it was a second or two before, and I don't remember any noticeable lag loading the Plex thumbs or Tautulli history before this change.  

     

    At first I thought that the combination of the write amplification issue and several rebalances had finally killed my two 480GB SanDisk SSDs (they had 3+ years of power-on time and were showing several hundred bad blocks), so I replaced them with new Samsung Pro drives, but I haven't seen any improvement.  I also tried switching from a docker img file to a directory on the share, which also doesn't seem to help.

     

    I also noticed that there are a lot of shfs processes that are often using the most CPU of anything; one has accumulated 48 hours of CPU time on a machine with 6 days of uptime

    (filtered htop screenshot attached)

     

    After reading the earlier posts in this thread I was wondering if this might be related.  My appdata and docker folders are both on cache-only shares, if that matters.  The array is single parity with 5x8TB and 5x3TB drives, the cache is 2x500GB SATA SSDs in btrfs RAID 1, and I also have a single 3TB HDD defined as another pool and an old 128GB SSD as an unassigned device.
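
    For a quick look at cumulative shfs CPU time without a full htop session, something like this works (a sketch using standard ps options):

        # per-process elapsed time, CPU time and %CPU for shfs
        ps -o pid,etime,time,%cpu,args -C shfs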

     

     


    25 minutes ago, atconc said:

    Has anyone else also noticed slower performance from cache pools since the partition layout changed in beta 29?  

    There is something misconfigured; please post your diagnostics.zip.


    Sorry to be a bit off topic here: I found bugs in the latest stable OVMF_CODE and OVMF_VARS releases when compiled from source.

    Unraid 6.9.0-beta30 is NOT affected (nor is 6.8.3, and most probably earlier versions), since it uses an OVMF version older than the 202005 stable release.

    May I know which version of OVMF you are using in 6.9.0-beta30 (or in 6.8.3, I think they are the same version), so I can properly report bugs in the TianoCore bug tracker?

     

    Thank you

     

    Update: no need, thanks anyway. I found my issue is macOS: apparently Xcode is not able to compile the files properly; everything is OK when compiled in a Linux environment with gcc.

    Edited by ghost82

    Hi, I have an issue here on 6.9.0-beta30: the webGUI is not accessible anymore since sometime today...

    root@AlsServer:~# uptime
     15:56:46 up 8 days,  9:04,  1 user,  load average: 16.96, 16.27, 15.04

    It has been up pretty much since the release of -beta30 and worked like a charm until today ;)

     

    Before I try to restart nginx (it would be nice to have a proper way to do so on Unraid), I tried to fetch some diagnostics, but it seems like I have no luck with that...

    root@AlsServer:~# diagnostics
    Starting diagnostics collection... 

    It just sits there forever...

     

    VMs, Dockers, shares, etc. all seem to work fine; only the webGUI is not working anymore.

     

    Maybe some hints on how to "safely" restart nginx on Unraid? Thanks ahead.
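
    For reference, the restart that is commonly suggested on Slackware-based systems like Unraid looks like this (a sketch, assuming the stock rc script is present on this release):

        # restart the webGUI's nginx instance via its rc script
        /etc/rc.d/rc.nginx restart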

    15 minutes ago, alturismo said:

    Hi, I have an issue here on 6.9.0-beta30: the webGUI is not accessible anymore since sometime today...

    Please capture the system log:

    cat /var/log/syslog > /boot/syslog.txt

    That will put syslog.txt in root of usb flash boot device.

    36 minutes ago, limetech said:

    Please capture the system log:

    
    cat /var/log/syslog > /boot/syslog.txt

    That will put syslog.txt in root of usb flash boot device.

    cat didn't work out, always giving "no such file .." on the target ...

     

    cat /var/log/syslog /boot/syslog.txt

    ....

    Oct 15 16:50:05 AlsServer sshd[8312]: Starting session: shell on pts/0 for root from 88.73.177.160 port 47984 id 0
    cat: /mnt/user/appdata/syslog.txt: No such file or directory
    root@AlsServer:~# 

     

    I copied the syslog file manually now, to be sure ;)
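
    For anyone hitting the same snag, copying the file avoids the redirection pitfall entirely (a sketch, assuming /boot is the flash device as on stock Unraid):

        # copy the live syslog to the root of the flash device
        cp /var/log/syslog /boot/syslog.txt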




