acropora, Posted November 16, 2017
Hey guys, I've had some issues where my system starts to lag for long periods of time. I can typically ping my server when it happens and it seems fine, but when I try to go to the home page or to any of the dockers, it's basically unresponsive or very slow. I can eventually get there, though, so it doesn't seem as if it has crashed. I opened up Stats for the last day and clearly something extreme is happening during this time. The issue has resolved itself, but can someone help me diagnose what it was?
Attachment: unraid-diagnostics-20171115-2104.zip
JorgeB, Posted November 16, 2017
You're getting constant page allocation stalls, some of them lasting over 80 seconds. This usually helps: https://forums.lime-technology.com/topic/55910-out-of-memory-errors-detected-on-your-server-call-traces-found-on-your-server/?do=findComment&comment=546982
acropora, Posted November 16, 2017
9 hours ago, johnnie.black said: "You're getting constant page allocation stalls, some of them lasting over 80 seconds, this usually helps: https://forums.lime-technology.com/topic/55910-out-of-memory-errors-detected-on-your-server-call-traces-found-on-your-server/?do=findComment&comment=546982"
Thanks for your response. I think your link is dead, but I found the page and your comment. I will try changing the values to 5 and 10 accordingly.
JorgeB, Posted November 16, 2017
2 minutes ago, acropora said: "Thanks for your response. I think your link is dead but I found the page and your comment"
Yes, somehow the forum software borked the link:
3 minutes ago, acropora said: "I will try and change the values to 5 and 10 acordingly."
I suggest using 1 and 2.
acropora, Posted November 16, 2017
Just now, johnnie.black said: "Yes, somehow the forum software borked the link: I suggest using 1 and 2."
Got it. I tried reading this page (https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/) and it's still unclear to me exactly what is happening. By setting them to 1 and 2, it will just flush more often, correct? I'm assuming this I/O issue usually occurs because my downloaders are running a lot of the time, so my cache is full of data (but some of it gets stored in RAM?), and once the mover runs, something happens that bogs the system down? Excuse my ignorance, I'm just trying to put it together.
JorgeB, Posted November 16, 2017
2 minutes ago, acropora said: "it will just flush more often, correct?"
Yes, it will start flushing when the data waiting to be written reaches 1% of RAM, instead of the default 10%, and it won't use more than 2% of RAM for that data, instead of the default 20%. The problem appears to occur after some time, when the RAM gets fragmented, resulting in allocation stalls or, worse, OOM errors. Keeping the cached RAM as low as possible seems to fix, or at least considerably help with, this problem.
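For a rough sense of scale: the two percentages discussed here appear to correspond to the Linux settings `vm.dirty_background_ratio` and `vm.dirty_ratio` covered in the lonesysadmin.net article linked earlier in the thread. The sketch below converts them to approximate byte thresholds. It assumes the percentage applies to total RAM, whereas the kernel works from available (free plus reclaimable) memory, so real thresholds will be somewhat lower. The 16 GiB figure and the function name `dirty_thresholds` are hypothetical, not taken from the diagnostics.

```python
from typing import Tuple

GIB = 1024 ** 3


def dirty_thresholds(total_ram_bytes: int,
                     background_ratio: int,
                     dirty_ratio: int) -> Tuple[int, int]:
    """Approximate the two write-back thresholds in bytes.

    background_ratio: percent of RAM holding unwritten data at which
                      background flushing starts.
    dirty_ratio:      percent of RAM holding unwritten data at which
                      writing processes are made to wait for the flush.
    Assumption: percentages are applied to total RAM (an upper bound).
    """
    for value in (background_ratio, dirty_ratio):
        if not 0 <= value <= 100:
            raise ValueError("ratios must be percentages between 0 and 100")
    background_bytes = total_ram_bytes * background_ratio // 100
    limit_bytes = total_ram_bytes * dirty_ratio // 100
    return background_bytes, limit_bytes


ram = 16 * GIB  # hypothetical server with 16 GiB of RAM
for label, bg, limit in (("default 10/20", 10, 20), ("suggested 1/2", 1, 2)):
    start, cap = dirty_thresholds(ram, bg, limit)
    print(f"{label}: background flush from ~{start / GIB:.2f} GiB, "
          f"writers held back at ~{cap / GIB:.2f} GiB")
```

With the lower values, far less unwritten data can pile up in RAM before the kernel starts pushing it to disk, which matches the "flush more often" reading above.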
Frank1940, Posted November 16, 2017
I have a feeling that the entire 2% (or 20%) is allocated at startup as a contiguous block and the OS won't release any of it.
As I recall from when I was investigating this a while back, there is a third parameter involved: a maximum time that a write can be delayed before the data starts to be written regardless of the percentage of RAM being used (think of it as limiting how long data is permitted to age before a write of that data is forced). (This variable is not presently available for unRAID users to tune. It may be tuneable from the command line.) So this is another factor, one that many unRAID users don't even realize is involved, in how much memory is really required.
You can construct a number of scenarios in trying to figure out the optimal size to allocate, but I would bet it doesn't make any significant difference in performance for the average unRAID user unless someone is running a transactional database (like a seat reservation system for a major airline) on the server.
I can remember when this delayed write-to-disk was introduced in Windows (either 3.1 or Win95). A lot of us turned it off, as the Windows OS and the hardware were unstable enough that we did not want to choose to have a file half in memory and half on the disk when the system crashed for whatever reason! (If you only had one crash a day, you were lucky!)
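The time-based parameter described here is most likely the Linux setting `vm.dirty_expire_centisecs` (how old unwritten data may get before it is flushed), which works alongside `vm.dirty_writeback_centisecs` (how often the flusher wakes up). That mapping is an assumption; the post does not name the variable. The sketch below only reads the current values from `/proc/sys/vm` and changes nothing; the function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional

# Write-back related settings exposed by Linux under /proc/sys/vm.
WRITEBACK_KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def centisecs_to_seconds(value: int) -> float:
    """Convert hundredths of a second to seconds."""
    return value / 100


def read_writeback_settings(base: str = "/proc/sys/vm") -> Dict[str, Optional[int]]:
    """Read each setting as an integer; None if it cannot be read."""
    settings: Dict[str, Optional[int]] = {}
    for name in WRITEBACK_KNOBS:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None  # not Linux, or file missing/unreadable
    return settings


for name, value in read_writeback_settings().items():
    if value is None:
        print(f"vm.{name}: unavailable")
    elif name.endswith("_centisecs"):
        print(f"vm.{name} = {value} ({centisecs_to_seconds(value):.0f} s)")
    else:
        print(f"vm.{name} = {value}%")
```

Run on the server itself, this shows all four values side by side, which makes it easier to see whether the age limit or the percentage limit is likely to trigger a flush first.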