• [6.7.x] Very slow array concurrent performance


    JorgeB
    • Solved Urgent

    For as long as I can remember, Unraid has never been great at simultaneous array disk performance, but it was acceptable. Since v6.7, however, various users have complained of very poor performance, for example when running the mover while trying to stream a movie.

     

    I noticed this myself yesterday when I couldn't even start watching an SD video in Kodi just because writes were going on to a different array disk, and this server doesn't even have a parity drive. I ran a quick test on my test server: the problem is easily reproducible and started with the first v6.7 release candidate, rc1.

     

    How to reproduce:

     

    -Server just needs 2 assigned array data devices (no parity needed, but same happens with parity) and one cache device, no encryption, all devices are btrfs formatted

    -Used cp to copy a few video files from cache to disk2

    -While cp is going on tried to stream a movie from disk1, took a long time to start and would keep stalling/buffering
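
    The steps above can be roughly sketched in Python for anyone who wants a repeatable number instead of a screenshot. This is only an illustration, not the procedure used for the results in this report (that was plain cp plus a file copy); the function names and the example file names are made up. The file being read should not already be in the page cache, or the read will look faster than it really is.

```python
import os
import shutil
import threading
import time


def timed_read(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return (bytes_read, seconds_elapsed)."""
    total = 0
    start = time.monotonic()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start


def read_while_copying(copy_src, copy_dst, read_path):
    """Copy copy_src to copy_dst in a background thread while reading read_path.

    Returns (bytes_read, seconds_elapsed) for the read only.
    """
    writer = threading.Thread(target=shutil.copyfile, args=(copy_src, copy_dst))
    writer.start()
    try:
        result = timed_read(read_path)
    finally:
        writer.join()  # always wait for the copy to finish
    return result


# Example call (hypothetical file names, adjust to files on your server):
# nbytes, secs = read_while_copying(
#     "/mnt/cache/video1.mkv",   # source on the cache device
#     "/mnt/disk2/video1.mkv",   # write target on disk2
#     "/mnt/disk1/movie.mkv",    # file read from disk1 during the write
# )
# print(f"{nbytes / 1e6 / max(secs, 1e-9):.1f} MB/s")
```

    Running the same call on v6.6.7 and on v6.7.x with the same files should make the difference visible as a MB/s figure.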

     

    Tried to copy one file from disk1 (while cp was still running on disk2), with v6.6.7:

     

    [Screenshot, 2019-08-05 11:58:06]

     

    with v6.7rc1:

     

    [Screenshot, 2019-08-05 11:54:15]

     

    Occasionally the transfer speed climbs for a couple of seconds, but most of the time it sits at a few KB/s or stalls completely.

     

    Also tried with all unencrypted xfs formatted devices and it was the same:

     

    [Screenshot, 2019-08-05 12:21:37]

     

    The server where the problem was detected and the test server have no hardware in common: one is based on an X11 Supermicro board and uses HDDs, while the test server is X9 series and uses SSDs, so this is very unlikely to be hardware related.

    • Like 1
    • Upvote 22



    User Feedback

    Recommended Comments



    39 minutes ago, Marshalleq said:

    In other news, yet another random drive read error has occurred overnight.

    Unlikely anything in s/w is going to cause an actual read error reported by the drive.

     

    I know you are pissed off about this problem but it does nothing for a technical discussion to say, "got another error!".

    Please post your diags. And if you have already, sorry I didn't look for them, but you can just post a link that says, "these diags show the error I'm getting in the system log". That's something I can look at.

    • Like 1
    Link to comment
    9 minutes ago, limetech said:

    Unlikely anything in s/w is going to cause an actual read error reported by the drive.

     

    I know you are pissed off about this problem but it does nothing for a technical discussion to say, "got another error!".

    Please post your diags. And if you have already, sorry I didn't look for them, but you can just post a link that says, "these diags show the error I'm getting in the system log". That's something I can look at.

    Hey, no I'm not pissed off or angry or anything, I apologise if I came across too strongly, it's hard to get the balance right in a text based forum - nevertheless I think I've been heard now! :D

     

    My logic was that if I don't say anything, then everyone will think it stopped happening.  Also, by posting - if it's happening to anyone else they'll chime in - (one person did), but given it's only one I'm happy to wait until 6.8.  Yeah I did post diags in another thread some time back, but as I said will wait for 6.8.

     

    Thanks so much for looking into this for us.

     

    Marshalleq.

    Edited by Marshalleq
    Link to comment
    On 9/21/2019 at 6:01 PM, Michael Woodson said:

    I wish I had found this post sooner. There should be a list of known ongoing problems linked to the Unraid dashboard.

     

    I know Unraid runs very barebones as a company, but this seems like an urgent issue. Indeed, Unraid is not that expensive, but I have come to rely on it more and more. I think it should be priced higher and even have an ongoing monthly fee to fund a higher level of development work.

     

    Therefore, also mark me down as someone willing to pay for a higher level of support and development and premium features. Unraid was born out of a group of people who wanted something better and now is going more mainstream to less technically savvy people. This issue has been extremely troublesome to me, and I have spent so much time on it that I will donate $100 to whoever proves they find the fix.

    An extra support tier and an issue list are features I'm also interested in.

    Link to comment

    It looks like I will be rolling back to 6.6.7 tomorrow.  After watching DVR recordings consistently fail when another one is already in progress, I have no choice but to roll back as I have a few overlapping records I need to do this weekend.

     

    What is obviously happening is that when a record/write is in progress, unRAID 6.7.x is so tied up with that process that when a new recording wants to start, the read to determine if there is enough space to record times out.  Plex DVR then fails to record because it claims disk space is too low (I have plenty of free space).  I have seen this pattern repeated many times with 6.7.x.

     

    It also takes FOREVER to open and view photos on the array whenever a record/other heavy write is going on.

     

    Fortunately, I do not have the SQLite DB corruption issues, but, it's back to 6.6.7 for me for the reasons stated above. That version seems to be free of these issues.

     

    I held out as long as possible and eagerly await a 6.8 RC to test to determine if the problem has been resolved.

     

    Thanks to @limetech for continuing to dig into these thorny issues.  They are obviously difficult to resolve.

    Edited by Hoopster
    • Like 1
    Link to comment
    15 minutes ago, Hoopster said:

    I held out as long as possible and eagerly await a 6.8 RC to test to determine if the problem has been resolved.

    The problem is resolved in 6.8.

    I guess the RC release isn't too far away.

    • Like 4
    Link to comment
    10 minutes ago, dgreig said:

    Unfortunately soon tends to be a very long time!

    Right? I'm still waiting for an updated version. In the meantime I can barely stream from Plex while adding new content to the server or running a backup from my PC. I must say the folks on this forum have been really helpful when I've had minor problems, but this is a major issue and we don't even have a specific date for a fix besides Soon^TM.

    Edited by shovenose
    Link to comment
    16 minutes ago, dgreig said:

    Unfortunately soon tends to be a very long time!

    Behind the scenes a lot of effort is made to make this work properly without any regression errors. Be patient.

    Edited by bonienl
    Link to comment

    I have been having the same issues noted above, but since I did not have anything new to add to the thread I did not feel the need to post until today.

    Instead I have been making little adjustments to try to improve performance, but nothing helped until yesterday, when I enabled NCQ support.

     

    I am now able to stream movies from Plex while performing other read/write operations on the array, such as running backups, without any hiccups. Performance is still not up to par but it has helped a lot; not sure why it would, though.

    Not saying this is a fix that will work for everyone, because I am sure that hardware support is an important factor, but it may be worth a try.
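
    For anyone who wants to check whether NCQ is actually in effect on their disks, here is a small sketch. It assumes the usual Linux sysfs layout, where each SATA/SAS disk exposes `/sys/block/sdX/device/queue_depth` and a depth of 1 means commands are issued one at a time (NCQ effectively off). The function names are made up for illustration, and this only reads the values; it does not change any setting.

```python
from pathlib import Path


def ncq_queue_depths(sys_block="/sys/block"):
    """Return {device_name: queue_depth} for each sd* device that exposes it."""
    depths = {}
    for dev in sorted(Path(sys_block).glob("sd*")):
        qd_file = dev / "device" / "queue_depth"
        try:
            depths[dev.name] = int(qd_file.read_text().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable, skip this device
    return depths


def ncq_active(depth):
    """A queue depth above 1 means more than one command can be outstanding."""
    return depth > 1


# Example:
# for name, depth in ncq_queue_depths().items():
#     print(name, depth, "NCQ on" if ncq_active(depth) else "NCQ off")
```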

     

     

     

    Link to comment
    1 hour ago, SuperDan said:

    ...nothing helped until yesterday when I enabled NCQ support.

    You got me excited, until I realized I already have NCQ enabled.

    • Haha 1
    Link to comment

    I'm pretty sure there's a discussion earlier in this thread that concludes 'you should have NCQ off for unraid', though it also says NCQ might help until this issue is resolved.

    Link to comment
    On 9/27/2019 at 2:28 PM, Hoopster said:

    It looks like I will be rolling back to 6.6.7 tomorrow.  After watching DVR recordings consistently fail when another one is already in progress, I have no choice but to roll back as I have a few overlapping records I need to do this weekend.

    Just in case someone is considering a roll back to 6.6.7 for the reason I stated; PSA: that will not resolve the issue with simultaneous Plex Live TV and DVR recordings IF you are transcoding to RAM and don't have a boatload of available RAM and you are using the relatively new transcode while recording feature.

     

    The roll back to unRAID 6.6.7 did resolve other array performance issues, but, 6.7.x is not responsible for the "not enough space" error I was seeing with Plex DVR.  

     

    Apparently, recent changes to Plex transcoding and the way it handles time shifting have dramatically increased available space requirements.  If you are transcoding in RAM, you could run out quickly.  My 32GB RAM is no longer sufficient to have two simultaneous records of a 1-hour program.  Plex wants 90 minutes worth of transcode space for each hour of record time and it was looking for at least 16GB free (for a 1080i program) when starting a record.  This seems like way more than necessary.  The first one was fine, when the second kicked off a minute later, it wanted 16GB free and found "only" 14.7GB free according to the Plex server logs.
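
    To put the numbers above in one place, here is a trivial sketch of the check as described. The 16 GB threshold and the 14.7 GB free figure are the values from the Plex server logs quoted above; the function name is made up and this is not Plex's actual code.

```python
def can_start_recording(free_gb, required_gb=16.0):
    """True if free transcode space meets the per-recording requirement."""
    return free_gb >= required_gb


# Second recording on a 32 GB RAM server, one recording already running:
print(can_start_recording(14.7))  # False, so Plex reports "not enough space"
```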

     

    Again, this is a recent change in Plex.  Because of this, I needed to redirect transcodes to an unassigned devices SSD rather than to RAM.  I am not thrilled about more wear and tear on the SSD, but it is my only option since I have "only" 32GB of RAM.

    Edited by Hoopster
    • Thanks 1
    Link to comment

    I have learned that this issue only affects me when my 1 Gbit connection is maxed out and writing to disk for about 20 seconds, at which point every VM and Docker container grinds to a halt. After about another 20 seconds things start to resume, and if the network activity is not complete the process repeats. I still have $100 on the table for when this is fixed. When 6.8 is fully released (not RC) and fixes this issue, where can I donate to Unraid?

     

    Thank you,

    Link to comment

    My problem certainly doesn’t have anything to do with hitting 1gbs.  Mine slows even when there is only like 120mbs of traffic.  And not even when transferring over Ethernet.  It’s just internal.

     

    Problem started on 6.7.0

    Link to comment

    Quick question to all the people having issues. Is there anyone with a Ryzen or TR4 system having issues? I am on a first gen TR4 and never had any problems. Are only Intel systems affected by this, maybe because of the Spectre Meltdown mitigations??? Just an idea.

    Link to comment
    1 hour ago, bastl said:

    Quick question to all the people having issues. Is there anyone with a Ryzen or TR4 system having issues? I am on a first gen TR4 and never had any problems. Are only Intel systems affected by this, maybe because of the Spectre Meltdown mitigations??? Just an idea.

    I'm on 3700x and I have the same issue

    Link to comment
    6 hours ago, bastl said:

    Quick question to all the people having issues. Is there anyone with a Ryzen or TR4 system having issues? I am on a first gen TR4 and never had any problems. Are only Intel systems affected by this, maybe because of the Spectre Meltdown mitigations??? Just an idea.

    This issue is not AMD/Intel related. A fix will be available when Unraid 6.8 is released.

    • Like 1
    Link to comment





  • Status Definitions

     

    Open = Under consideration.

     

    Solved = The issue has been resolved.

     

    Solved version = The issue has been resolved in the indicated release version.

     

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

     

    Retest = Please retest in latest release.


    Priority Definitions

     

    Minor = Something not working correctly.

     

    Urgent = Server crash, data loss, or other showstopper.

     

    Annoyance = Doesn't affect functionality but should be fixed.

     

    Other = Announcement or other non-issue.