cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up



As I mentioned in a prior post, it can help alleviate the NFS stale-file-handle issue if you have a user share mounted over NFS.  So I would lobby that it stays, with a note on recommended usage.

 

I wouldn't even bother posting this except that this use case was something Tom mentioned to me.

  • 2 weeks later...
So are you using it with "scan user shares" set to yes or no then?

 

I was ... on V4 and V5

 

Now that I'm on V6, I haven't spent time getting it working, but user shares now take a very long time on first access.  So shfs may be a memory-resident file system, but memory caching definitely makes a great difference... oh, and I use NFS exclusively.


I am curious: whilst cache_dirs is arch-agnostic, there was some talk about optimizing it for 64-bit architectures after the removal of the low-memory restrictions.

 

Is this still the case? Is there anything else we can tweak here to extract the maximum benefit from unRAID 6?


Finally bit the bullet and moved my main array to the 64-bit 6-series beta.

 

cache_dirs is awful: when you start the daemon, it causes the whole webGui to freeze for a period longer than I was prepared to wait to see if it came back.

 

I have a vague recollection this was solved somewhere in some thread?


Some findings on cache_dirs (a little bit troubling).

I am caching both my movie and series media (yes, there is a huge amount of it: 11TB).

On V5, top reports


Full Top Capture (V5) -> http://imgbin.org/images/17191.JPG

 

 

On V6, top reports (note: about 2GB of the maximum is used by a domain/VM, but I think dom0 still remains separate):


 

Full Top Capture ( V6) -> http://imgbin.org/images/17192.JPG

 

Thus caching uses about 1.58GB of memory in total on the V5 unRAID box, versus a huge 3.62GB on the V6 unRAID box (and V5 has MySQL, roughly 300MB, running as well) - that is over 1.5GB more to cache the same data...

 

 

I know cache_dirs is only performing a simple find loop, and thus the issue is most probably not with the script itself, but does anybody else maybe have an idea as to why this could be?
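For anyone unfamiliar with the script, the core idea can be sketched roughly like this (a simplified sketch only, not the actual cache_dirs code; the share path and interval below are illustrative placeholders):

```shell
#!/bin/sh
# Sketch of the cache_dirs idea: walk the directory tree repeatedly so
# the kernel keeps the dentry/inode entries warm in its caches.
# SHARE_PATH and SLEEP_SECS are placeholders, not cache_dirs defaults.
SHARE_PATH="${1:-/mnt/user}"
SLEEP_SECS="${2:-10}"

while true; do
    # Listing entries (output discarded) is enough to populate the
    # dentry and inode caches; no file contents are read.
    find "$SHARE_PATH" >/dev/null 2>&1
    sleep "$SLEEP_SECS"
done
```

The real script adds depth limits, include/exclude lists, and adaptive scan intervals on top of this loop, but the memory behaviour in question comes from the kernel caching, not from the script itself.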

 

I really don't like leaving the server running with only about 50MB of memory free:

root@Storage:~# free -m
             total       used       free     shared    buffers     cached
Mem:          3593       3539         53          0        767       1696
-/+ buffers/cache:       1076       2517
Swap:            0          0          0

 

 

Hopefully a Linux guru can assist? :) :) - *free beer is available*

 

PS

V6 cache_dirs also gave an issue when started via the go script - not sure if related. I had to kill the process ID and restart it manually.


It is my understanding that the cache memory is used and released as other needs demand, i.e. it uses only free memory. It doesn't just cache directories but files as well, so if you read files on one system but not the other, it would skew things.

 

Could be wrong. *shrug*

 

 

The cache area is transient.

It expands up to all useful memory. As other areas of the kernel or applications need memory, cache pages are released.

 

 

The issue between 32-bit and 64-bit is that 32-bit has a finite amount of low memory. As this gets used up, it's not released as easily as the cache, plus it can also get fragmented.  On 32-bit, adding a swap file can help, as can dropping the cache. This may not be needed on 64-bit.

We won't know for a while what the ramifications of long-term high cache_dirs usage will be.

My guess would be that 64 bit handles memory management of busy dentries better.
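You can watch this behaviour directly from the console with standard Linux commands (root is needed for the drop_caches write; note that doing this will force the next directory walk to hit the disks again):

```shell
# Show current memory use; the "cached" column is the transient cache.
free -m

# Flush dirty data first, then ask the kernel to drop clean page
# cache, dentries and inodes (3 = all of them). Only reclaimable
# cache is released; application memory is untouched.
sync
echo 3 > /proc/sys/vm/drop_caches

# "cached" and "buffers" should now be much smaller, and a find over
# a share will be slow again until cache_dirs re-warms the cache.
free -m
```

Comparing free -m before and after a drop is a quick way to see how much of "used" memory is really just cache that the kernel would have given back anyway.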


 

Neo_X, commendable work. As WeeboTech points out, you need to look a bit deeper to get the info you actually need.

 

Look at this post for some useful, relevant debug commands:

 

http://lime-technology.com/forum/index.php?topic=4500.msg286103#msg286103

 

Hi Nas

 

I can make the debug captures, no problem (even a comparison after a few hours between V5/V6 again), but the very strange part is that I am seeing the reverse - i.e. V5 is managing the cache with less memory than V6 (64-bit) is.

 

Hmmm...

Just for fun, I think I will make a capture without Xen as well, just in case.

 

Regards

 

Neo_X


Hey neo_x, interested in this:

 

PS

V6 cache_dirs also gave an issue when started via the go script - not sure if related. I had to kill the process ID and restart it manually.

 

I did run cache_dirs via the go script and found it consumed all resources on my server, basically killing dom0 and all domUs after approximately 2 days. What issue were you seeing when starting it via the go script - high memory consumption, or something else? I have of course stayed away from running cache_dirs for the time being after that experience. Can anybody confirm they have this running stable on V6, and if so, what flags they are using?

 

cheers.

binhex.

 

P.S. Nice to see ya on another board :-)


Hey neo_x, interested in this:

 

PS

V6 cache_dirs also gave an issue when started via the go script - not sure if related. I had to kill the process ID and restart it manually.

 

I did run cache_dirs via the go script and found it consumed all resources on my server, basically killing dom0 and all domUs after approximately 2 days. What issue were you seeing when starting it via the go script - high memory consumption, or something else? I have of course stayed away from running cache_dirs for the time being after that experience. Can anybody confirm they have this running stable on V6, and if so, what flags they are using?

 

cheers.

binhex.

 

P.S. Nice to see ya on another board :-)

 

Hi Binhex

 

Same here - glad to see you. :)

 

I have one domain running, and upon starting cache_dirs via the go script and checking top after about 30 minutes to an hour, I saw 100% CPU utilization (most of which was taken up by the cache_dirs script).

Luckily, on my end I have dedicated one core to dom0, which stabilized the system drastically (thus nothing crashed).

My syslinux:

label Xen/unRAID OS
  kernel /syslinux/mboot.c32
  append /xen dom0_max_vcpus=1 dom0_vcpus_pin --- /bzimage --- /bzroot

The strange part: upon trying "cache_dirs -q", it reported that it is not running... thus I had to resort to manually killing the process ID.

 

I tried adding a "sleep 120" in the go script before the cache_dirs line, but it didn't help.

 

The only option so far was to manually telnet into the box after a few minutes and run cache_dirs from the telnet prompt.
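Since "cache_dirs -q" claimed it wasn't running, the surviving instance had to be found by hand. Something along these lines works from the telnet prompt (the bracket trick stops grep from matching its own process; pgrep, if available, does the same in one step):

```shell
# List any running cache_dirs processes with their PIDs.
ps aux | grep '[c]ache_dirs'

# Equivalent one-step lookup, printing PID and full command line.
pgrep -fl cache_dirs

# Then kill the stray instance by PID once identified, e.g.:
#   kill <PID>      # replace <PID> with the number printed above
```

A plain kill lets the script exit cleanly; only escalate to kill -9 if the process ignores the first signal.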

 

Busy running tests with stock unRAID; will check if the same occurs when Xen is not running.

 

 

Regards

 

Neo_x


I found that I had to comment out the ulimit line to get cache_dirs to run reliably in the 64-bit environment.

 

Yes, I did see that suggestion earlier in this thread and had done that in the script before adding it to the go file and rebooting; it still caused a nasty OOM and crash. I take it you are running it now? If so, what flags are you using, and are you running it via the go file or starting it manually after unRAID has booted?


I found that I had to comment out the ulimit line to get cache_dirs to run reliably in the 64-bit environment.

 

Yes, I did see that suggestion earlier in this thread and had done that in the script before adding it to the go file and rebooting; it still caused a nasty OOM and crash. I take it you are running it now? If so, what flags are you using, and are you running it via the go file or starting it manually after unRAID has booted?

I start mine from the go file using a command line of the form:

/boot/cache_dirs -w -B -m 15 -M 30 -d 6 -e 'Archives'

 


 

Neo_X, commendable work. As WeeboTech points out, you need to look a bit deeper to get the info you actually need.

 

Look at this post for some useful relevant debug commands

 

http://lime-technology.com/forum/index.php?topic=4500.msg286103#msg286103

 

 

Hi Nas/ guys

 

As promised, some more data below, which hopefully can assist.

Some observations: strangely, I am not seeing major memory leakage (comparing 4 hours and 10 hours); will keep monitoring.

The big difference is that on the same hardware, memory usage on V5 was 1116MB versus 2777MB on V6 (no Xen) and 2392MB on V6 (Xen)... which is odd at best. Unless 64-bit has more overhead to store the same data?

 

All testing was performed with stock unRAID builds, capturing before cache_dirs and after running cache_dirs for about 4 hours.

The V6 cache_dirs was modified to comment out the ulimit line.

I know some data was repeated unnecessarily (e.g. file counts and sizes), but I'd rather repeat it to make sure nothing slips through :)
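For anyone wanting to make the same ulimit modification, a sed one-liner can comment the line out rather than editing by hand (the exact line differs between cache_dirs versions, so this matches any line beginning with ulimit; keep a backup first):

```shell
# Keep a backup of the original script before touching it.
cp /boot/cache_dirs /boot/cache_dirs.bak

# Prefix any line that starts with (optional whitespace then) "ulimit"
# with a '#', disabling it without deleting it.
sed -i 's/^\([[:space:]]*ulimit\)/#\1/' /boot/cache_dirs
```

Commenting rather than deleting makes it trivial to restore the limit later when testing whether it was actually the culprit.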

 

Character limit - had to attach the captures instead (2 posts).

 

EDIT - As recommended by NAS, pastebin was utilized :)

 

v5_0_5 - no cache_dirs -> http://pastebin.com/cHRDuEy8

 

v5_0_5 - cache_dirs running(4 hours)  -> http://pastebin.com/GPCB9tuB

 

v6b4(no xen) - no cache_dirs -> http://pastebin.com/UP3TQ36w

 

v6b4(no xen) - cache_dirs running(4 hours) -> http://pastebin.com/4RTYZrAW

 

v6b4(xen) - no cache_dirs -> http://pastebin.com/6LXjv40P

 

v6b4(xen) - cache_dirs running(10 hours) -> http://pastebin.com/Zy2esyEZ

 

 

 

 

 


As promised, some more data below, which hopefully can assist.

Some observations: strangely, I am not seeing major memory leakage (comparing 4 hours and 10 hours); will keep monitoring.

The big difference is that on the same hardware, memory usage on V5 was 1116MB versus 2777MB on V6 (no Xen) and 2392MB on V6 (Xen)... which is odd at best. Unless 64-bit has more overhead to store the same data?

I wonder if it is something simpler - such as the fact that 64-bit systems do not have the low-memory constraint, so Linux can keep more entries cached because it is never forced to push entries out of the cache.  Does anyone have any idea how one might investigate this?
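One cheap place to start is the kernel's own dentry counters, which exist on both 32-bit and 64-bit kernels. Comparing nr_unused between V5 and V6 under the same workload would show whether 64-bit is simply keeping more entries around:

```shell
# System-wide dentry counts. The first two fields are nr_dentry
# (allocated) and nr_unused (cached but not in active use).
cat /proc/sys/fs/dentry-state

# Slab-level detail for the dentry and inode caches (reading
# /proc/slabinfo may require root on some kernels).
grep -E '^(dentry|inode_cache)' /proc/slabinfo 2>/dev/null
```

If nr_unused is far higher on the 64-bit box for the same share contents, the extra "used" memory is just a bigger cache, not a leak.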


Linux memory management is very different from Windows memory management, which is what most of us are used to.  We need to read more closely what WeeboTech is saying (and I hope we hear more from him):

The cache area is transient.

It expands up to all useful memory. As other areas of the kernel or applications need memory, cache pages are released.

 

The issue between 32-bit and 64-bit is that 32-bit has a finite amount of low memory. As this gets used up, it's not released as easily as the cache, plus it can also get fragmented.  On 32-bit, adding a swap file can help, as can dropping the cache. This may not be needed on 64-bit.

We won't know for a while what the ramifications of long-term high cache_dirs usage will be.

My guess would be that 64 bit handles memory management of busy dentries better.

 

We're used to thinking of memory in a capitalistic sense - you need memory, so you get it and now you own it and no one else can use it.  Linux seems more like a hippie sharing economy - you need memory, so you use what you need until someone else needs some of it.  That makes the memory reporting tools somewhat useless.

The top command seems particularly useless because it doesn't even break down what is being used in lowmem as opposed to highmem.  In my experience, the disk caches appeared to be stored in lowmem, which has always appeared to be limited to less than the first gigabyte, typically 893MB or less.  So comparing top results between 32-bit and 64-bit releases seems useless.  And throwing more memory at a 32-bit system only increased highmem; it made no difference at all in alleviating any of the lowmem limitations.

 

I really don't like leaving the server running with only about 50MB of memory free

This statement is correct from a Windows standpoint, but not in the Linux world, since we have no idea how much is actually available if we make a request.  What would be useful is knowing how much can actually be given back by all current memory consumers and pools, but that does not appear to be available from any tools I could find.  Any memory pool must have some minimum requirement beyond which it cannot give back.

 

Some additional commands that may be useful are 'slabtop', 'slabtop -s c', 'vmstat', and 'vmstat -m'.  They do break down the cache usage into its parts.  Perhaps further research and testing here could use these to find particular numbers that are more indicative of OOM danger?


Some additional commands that may be useful are 'slabtop', 'slabtop -s c', 'vmstat', and 'vmstat -m'.  They do break down the cache usage into its parts.  Perhaps further research and testing here could use these to find particular numbers that are more indicative of OOM danger?

 

In addition, the 'cat /proc/meminfo' command may be useful, because it shows low and high memory totals.  It also shows Slab, SReclaimable, and SUnreclaim, which may be a way to see what cache memory can and cannot be reclaimed on request.  I'd be interested in what testers find with these numbers, especially when comparing 32-bit and 64-bit versions under differing loads.
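A quick filter pulls just those fields out (note that LowTotal and HighTotal only appear on 32-bit kernels with highmem enabled; on 64-bit they are simply absent):

```shell
# Slab is total kernel slab memory; SReclaimable is the part (mostly
# dentry and inode caches) that can be given back under pressure;
# SUnreclaim is the part that cannot, and is the number worth
# watching for OOM risk.
grep -E '^(MemTotal|LowTotal|HighTotal|Slab|SReclaimable|SUnreclaim):' /proc/meminfo
```

Capturing this output alongside the slabtop/vmstat data at regular intervals would make long-running comparisons between releases much easier to read.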

 

I've added more info to the Console commands wiki page, hopefully to get more users started.

We're used to thinking of memory in a capitalistic sense - you need memory, so you get it and now you own it and no one else can use it.  Linux seems more like a hippie sharing economy - you need memory, so you use what you need until someone else needs some of it.

I love this line; it describes Linux memory management with the buffer cache eloquently and in very common terms!!
