unRAID unresponsive for 12 hours


acropora


Hey guys, I've had some issues where my system lags for long periods of time. I can typically ping my server when it happens and it seems fine, but when I try to go to the home page or to any of the dockers, it's basically unresponsive or very slow. I can eventually get there, though, so it doesn't seem as if it has crashed. I opened up Stats for the last day and clearly something extreme is happening during this time. The issue has since resolved itself. Can someone help me diagnose what it was?

 

 

image.png

unraid-diagnostics-20171115-2104.zip

9 hours ago, johnnie.black said:

You're getting constant page allocation stalls, some of them lasting over 80 seconds. This usually helps:

 

https://forums.lime-technology.com/topic/55910-out-of-memory-errors-detected-on-your-server-call-traces-found-on-your-server/?do=findComment&comment=546982

 

Thanks for your response. I think your link is dead, but I found the page and your comment. I will try changing the values to 5 and 10 accordingly.

 

 

2 minutes ago, acropora said:

Thanks for your response. I think your link is dead but I found the page and your comment

 

Yes, somehow the forum software borked the link:

 

3 minutes ago, acropora said:

I will try and change the values to 5 and 10 accordingly.

 

I suggest using 1 and 2.

 

Just now, johnnie.black said:

 

Yes, somehow the forum software borked the link:

 

 

I suggest using 1 and 2.

 

 

Got it. I tried reading this page (https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/) and it's still unclear to me exactly what is happening. By setting the values to 1 and 2, it will just flush more often, correct? I'm assuming this I/O issue usually occurs because my downloaders are running a lot, so my cache is full of data (but some of it gets stored in RAM?), and once the mover runs, something happens that bogs the system down? Excuse my ignorance, just trying to put it together.

2 minutes ago, acropora said:

it will just flush more often, correct?

 

Yes, it will start flushing in the background when dirty (not yet written) data reaches 1% of RAM, instead of the default 10%, and it won't let dirty data use more than 2% of RAM, instead of the default 20%.
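To put rough numbers on those percentages, here is a minimal sketch in Python. It assumes the two values map to the Linux `vm.dirty_background_ratio` and `vm.dirty_ratio` settings discussed on the linked page, and it uses a made-up 16 GiB of RAM. The kernel actually applies the percentages to available (free plus reclaimable) memory rather than total RAM, so treat the output as an upper-bound estimate. The function name `dirty_thresholds` is only for illustration.

```python
GIB = 1024 ** 3


def dirty_thresholds(available_bytes, background_ratio, dirty_ratio):
    """Return (background_bytes, blocking_bytes) for the given percentages.

    background_bytes: amount of dirty data at which background flushing starts.
    blocking_bytes:   amount of dirty data at which writers are throttled.
    """
    for ratio in (background_ratio, dirty_ratio):
        if not 0 <= ratio <= 100:
            raise ValueError("ratios are percentages and must be in 0..100")
    background = available_bytes * background_ratio // 100
    blocking = available_bytes * dirty_ratio // 100
    return background, blocking


# Hypothetical server with 16 GiB of RAM (not taken from the diagnostics).
ram = 16 * GIB
for label, bg, hard in (("default 10/20", 10, 20), ("suggested 1/2", 1, 2)):
    start, limit = dirty_thresholds(ram, bg, hard)
    print(f"{label}: background flush at {start / GIB:.2f} GiB, "
          f"writers throttled at {limit / GIB:.2f} GiB")
```

With the lower values the amount of unwritten data that can pile up in RAM is roughly ten times smaller, which is why the flushes happen more often but in smaller chunks.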

 

The problem appears to occur after some time, when the RAM gets fragmented, resulting in allocation stalls or, worse, OOM errors. Keeping the cached RAM as low as possible seems to fix, or at least considerably help with, this problem.


I have a feeling that the entire 2% (or 20%) is allocated at startup as a contiguous block and the OS won't release any of it. As I recall from when I was investigating this a while back, there is a third parameter involved: a maximum time that a write can be delayed before the data will be written regardless of the percentage of RAM being used (think of it as limiting how long data is permitted to age before a write of that data is forced). (This variable is not presently available for unRAID users to tune. It may be tunable from the command line.) So this is another factor, one that many unRAID users don't even realize is involved, in how much memory is really required. You can construct a number of scenarios in trying to figure out what would be an optimal size to allocate, but I would bet it does not make any significant difference in performance for the average unRAID user unless someone is running a transactional database (like a seat reservation system for a major airline) on the server.
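For anyone who wants to see what their server is currently using, here is a small Python sketch that reads the relevant values from `/proc/sys/vm`. The post above does not name the third parameter; my assumption is that it is the Linux `vm.dirty_expire_centisecs` setting (the age, in hundredths of a second, after which dirty data becomes eligible for writeback), with `vm.dirty_writeback_centisecs` controlling how often the flusher wakes up. The function name `read_vm_settings` and the `base` argument are made up for this example.

```python
from pathlib import Path

# Writeback-related sysctls; the two *_centisecs entries are an assumption
# about which "third parameter" the post is referring to.
VM_KEYS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_settings(base="/proc/sys/vm"):
    """Return {name: int or None} for each key; None if it cannot be read."""
    settings = {}
    for key in VM_KEYS:
        try:
            settings[key] = int((Path(base) / key).read_text().strip())
        except (OSError, ValueError):
            # File missing (non-Linux system) or unexpected contents.
            settings[key] = None
    return settings


if __name__ == "__main__":
    for name, value in read_vm_settings().items():
        shown = "unavailable" if value is None else value
        print(f"vm.{name} = {shown}")
```

This only reads the values, so it is safe to run without root. Changing them is a separate step and needs root privileges.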

 

I can remember when this delayed write-to-disk was introduced in Windows (either 3.1 or Win95). A lot of us turned it off, because the Windows OS and the hardware were unstable enough that we did not want, by conscious choice, to have a file that was half in memory and half on disk when the system crashed for whatever reason! (If you only had one crash a day, you were lucky!)
