cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up


Recommended Posts

9 hours ago, Fireball3 said:

You should be seeing an update for the plugin now, shouldn't you?

Yeah, I did go ahead and update, but as I indicated, it's been a long-running issue across multiple versions.

53 minutes ago, Fireball3 said:
6 hours ago, themaxxz said:
 
I also still have the same issue with version 2018.11.20

Did you guys "Turn it OFF and ON again" (aka reboot)?

The issue has persisted across multiple reboots.

Link to comment

Have you tried with the multi-thread option set to no?

 

Also give it a try without adaptive.

 

And finally, start reading here and follow the troubleshooting instructions that I had to work through.

Only the ones targeting your issue, of course. It's basically about providing logs.

Better to read through the whole thread first, as the instructions and scripts have evolved, and grab the latest ones posted.

If you have logs that can help to narrow down the issue please upload them here.
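
(For reference, the plugin's logs live under /var/log, as mentioned later in the thread; something like the following collects them, assuming the default locations:)

```
ls -l /var/log/cache_dirs*          # list the cache_dirs log files
tail -n 50 /var/log/cache_dirs.log  # grab the tail of the current log to post
```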

@Alex R. Berg will take over then, hopefully. :)

Edited by Fireball3
Link to comment

I've released a new version on my 'beta' fork: https://raw.githubusercontent.com/arberg/dynamix/master/unRAIDv6/dynamix.cache.dirs.plg


It fixes the -a option and adds help information to the plugin page on how to filter dirs.

Example:

 

-a '-noleaf -name .Recycle.Bin -prune -o -name log -prune -o -name temp -prune -o -name .sync -prune -o -print'

 

Avoid the () of Joe's example. Unfortunately * and " do not work, so we cannot filter for "*Old". The plugin mangles the double quotes, and the cache_dirs script does not respond correctly even when it receives a properly quoted -name "*Old" -prune.
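
To make the filtering easier to reason about, this is roughly the find invocation that gets built from the -a arguments above. It is only a sketch; the actual path depends on which disks and shares cache_dirs is scanning:

```
# Sketch of the per-directory scan built from the -a args above
# (the path is illustrative; the script iterates over included dirs on /mnt/disk*)
find /mnt/disk1/Movies -noleaf \
  -name .Recycle.Bin -prune -o \
  -name log -prune -o \
  -name temp -prune -o \
  -name .sync -prune -o \
  -print > /dev/null
```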

 

I'll push it to dynamix, so it'll probably be live in the main branch in a few days.

 

Link to comment

I noticed a few days ago that cache_dirs was scanning my dirs at depth 5, each scan took 45s and touched the disks, and it had been going on for a long time. A parity check had run the day before; I don't know if that was related. The interesting part is that it quickly recovered to full-depth scans with idle disks after I wrote 7 GB to the /tmp drive and deleted it (with my test_free_memory.sh script). Perhaps the writing caused Linux to move some of its memory cache around.
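
A minimal sketch of what such a helper might do (the filename and size are illustrative, not the actual contents of test_free_memory.sh): write a large file to the RAM-backed /tmp, delete it, and check how the memory shifted.

```
#!/bin/bash
# Sketch: pressure the page cache by writing a large file to the RAM-backed /tmp,
# then deleting it and checking how memory shifted. Size and filename are illustrative.
dd if=/dev/zero of=/tmp/free_memory_test.bin bs=1M count=7000
sync
rm -f /tmp/free_memory_test.bin
free -m
```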

 

I'm using a cache pressure of 1.
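
(The cache pressure setting corresponds to the kernel's vm.vfs_cache_pressure tunable; you can inspect or change it by hand, for example:)

```
cat /proc/sys/vm/vfs_cache_pressure   # show the current value
sysctl vm.vfs_cache_pressure=1        # lower values favour keeping dentry/inode caches in RAM
```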

 

This looks a bit similar to what was happening to you, @wgstarks, if I'm not mistaken: if there isn't enough free memory for whatever reason, cache_dirs spams the disks.

Link to comment

So I had updated the plugin after my last post (now running versions 2.2.5/2018.11.18 on unRAID 6.5.2), but I didn't reboot unRAID. I was gone for a couple of days, came back, and it's pegged at 100% again on one of my cores. I haven't restarted the process this time and will leave it as is in case you have further commands for me to run.

I followed Fireball3's post; here are the commands I have seen Alex request, along with my output (and top as well). The cache_dirs.log file seems to be empty, but there are two others in that folder with some data. I'm unsure why the log ends this morning at 3:41 am, and there doesn't seem to be anything in /var/log/cache_dirs.log unless it's being saved somewhere else and I just need to check tomorrow. (Update: I checked, and it didn't create a new file, which leads me to believe it's hung.)

 

PS: All my drives except the parity drive are currently spun up. I'm unsure if that's related to this, or if it's Emby's scans spinning up the drives because the cache might not be working.

pstree -p | grep -2 cache_dirs
        |-avahi-daemon(4185)---avahi-daemon(4186)
        |-avahi-dnsconfd(4195)
        |-cache_dirs(9820)---cache_dirs(3574)---cache_dirs(3575)---mdcmd(3576)
        |-crond(1602)
        |-dbus-daemon(1538)

ps -e x -o ppid -o pid -o pgid -o tty -o vsz -o rss -o etime -o cputime -o rgroup -o ni -o fname -o args | grep "cache_dirs\|find\|wc"
 9820  3574  4591 ?         14416  3204    16:19:06 00:00:00 root       0 cache_di /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e .FTP -e Backup -e CAAppdataBackup -e appdata -e system -l on
 3574  3575  4591 ?         14416  3208    16:19:06 16:19:05 root       0 cache_di /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e .FTP -e Backup -e CAAppdataBackup -e appdata -e system -l on
    1  9820  4591 ?         14416  4168  3-01:00:46 00:03:16 root       0 cache_di /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e .FTP -e Backup -e CAAppdataBackup -e appdata -e system -l on
29926 30702 30701 pts/0      9812  2068       00:00 00:00:00 root       0 grep     grep cache_dirs\|find\|wc
top - 20:16:35 up 162 days, 22:45,  1 user,  load average: 1.05, 1.07, 1.07
Tasks: 346 total,   2 running, 236 sleeping,   0 stopped,   1 zombie
%Cpu(s): 25.9 us,  0.7 sy,  0.0 ni, 73.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16113940 total,  5576676 free,  7793844 used,  2743420 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  6645500 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3575 root      20   0   14416   3208   1496 R 100.0  0.0 995:13.86 cache_dirs
17977 nobody    20   0 2299916 181864   4132 S   2.7  1.1 415:15.15 sabnzbdplus
31464 root      20   0  572708  35896   1972 S   2.0  0.2 364:10.26 cadvisor
 4208 root      20   0  294864   3232   2524 S   0.3  0.0 878:32.51 emhttpd
 4730 root      20   0 1395280  89652    824 S   0.3  0.6   1295:37 shfs
 5543 root      20   0 2237536  58188  28128 S   0.3  0.4 265:50.77 dockerd
25051 nobody    20   0  174252  31572   1948 S   0.3  0.2   4:21.81 python
28802 daemon    20   0 4818076 341044  10692 S   0.3  2.1 737:11.89 EmbyServer
    1 root      20   0    4504    772    712 S   0.0  0.0   1:17.76 init
    2 root      20   0       0      0      0 S   0.0  0.0   0:01.28 kthreadd
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_wq
    7 root      20   0       0      0      0 S   0.0  0.0  13:37.10 ksoftirqd/0
    8 root      20   0       0      0      0 I   0.0  0.0 205:33.34 rcu_preempt
    9 root      20   0       0      0      0 I   0.0  0.0   0:08.05 rcu_sched
   10 root      20   0       0      0      0 I   0.0  0.0   0:00.00 rcu_bh
   11 root      rt   0       0      0      0 S   0.0  0.0   0:21.45 migration/0
   12 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/0
   13 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/1
   14 root      rt   0       0      0      0 S   0.0  0.0   0:21.40 migration/1
   15 root      20   0       0      0      0 S   0.0  0.0  10:54.18 ksoftirqd/1
   17 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/1:0H
   18 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/2
   19 root      rt   0       0      0      0 S   0.0  0.0   0:21.53 migration/2
   20 root      20   0       0      0      0 S   0.0  0.0  11:02.52 ksoftirqd/2
   22 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/2:0H
   23 root      20   0       0      0      0 S   0.0  0.0   0:00.00 cpuhp/3
   24 root      rt   0       0      0      0 S   0.0  0.0   0:21.31 migration/3
   25 root      20   0       0      0      0 S   0.0  0.0   9:42.75 ksoftirqd/3
   27 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/3:0H
   28 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kdevtmpfs
   29 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 netns
  291 root      20   0       0      0      0 S   0.0  0.0   0:00.00 oom_reaper
  292 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 writeback

 

cache_dirs-20181207_1544175601.log

cache_dirs.log

cache_dirs-20181206_1544089201.log

Edited by Necrotic
Link to comment

So I stopped it last night and changed it from adaptive to fixed, then started the plugin again.

I arrived this afternoon to find one core pegged at 100% once more. The log stops at 5:58am.

I guess I will have to disable this until a solution can be found. I will try to update unRAID at some point and see if that resolves it, or perhaps find a way to roll back to an earlier version of cache_dirs.

Link to comment
On 12/19/2018 at 10:19 AM, niwmik said:

Had the 100% CPU on one core issue on 3 different unRAID 6.5.3 servers with cache_dirs version 2.2.5. Upgraded all 3 servers to 6.6.6 and, within 48 hours of uptime, I haven't seen the 100% CPU issue.

 

Thanks for the update, niwmik. Please let me know how it goes from here.

Link to comment

I used to use cache_dirs long ago in unRAID v5 and don't remember for certain when I dropped it.

I've been having some long hesitations when navigating my Plex directories and thought perhaps caching folders again might help. It seemed like certain functions were waiting for drives to spin up.

After turning on cache_dirs (Folder Caching), drives spin up fairly often even though no changes are being made, and while CPU usage certainly isn't pegged, it runs at 15-40% pretty consistently.

 

I've just been using the default settings; is there some way to control this? I would rather it not scan unless it knows something has changed, or maybe once a day outside of known changes.

 

** Note: while typing this I went and added appdata, dockers, etc. to the exclusion list, on the thought that Plex constantly doing its thing may be triggering cache_dirs to re-scan. I'll keep an eye on it to see if things change.

Edited by TODDLT
Link to comment
2 hours ago, TODDLT said:

** Note: while typing this I went and added appdata, dockers, etc. to the exclusion list, on the thought that Plex constantly doing its thing may be triggering cache_dirs to re-scan.

 

A recent update to cache dirs somehow cleared out all my exclusions.  I noticed over the past few days that CPU usage was higher than normal with one or two alternating CPUs hitting 80-100% every 5-10 seconds. 

 

htop showed that the process using the CPU was "/mnt/user/appdata -noleaf" (cache dirs).  I then checked my cache dirs setting and noticed that all the exclusions had disappeared.  Resetting the exclusions (appdata, and several other less-used folders) brought CPU usage back to normal.

Edited by Hoopster
Link to comment
On 12/22/2018 at 4:35 PM, Hoopster said:

 

A recent update to cache dirs somehow cleared out all my exclusions.  I noticed over the past few days that CPU usage was higher than normal with one or two alternating CPUs hitting 80-100% every 5-10 seconds. 

 

htop showed that the process using the CPU was "/mnt/user/appdata -noleaf" (cache dirs).  I then checked my cache dirs setting and noticed that all the exclusions had disappeared.  Resetting the exclusions (appdata, and several other less-used folders) brought CPU usage back to normal.

Thanks,

 

After watching this for a few days it appears the exclusions have solved my issues too.

 

 

Link to comment
On 12/22/2018 at 4:35 PM, Hoopster said:

 

A recent update to cache dirs somehow cleared out all my exclusions.  I noticed over the past few days that CPU usage was higher than normal with one or two alternating CPUs hitting 80-100% every 5-10 seconds. 

 

htop showed that the process using the CPU was "/mnt/user/appdata -noleaf" (cache dirs).  I then checked my cache dirs setting and noticed that all the exclusions had disappeared.  Resetting the exclusions (appdata, and several other less-used folders) brought CPU usage back to normal.

Was it completely gone from the settings or was it listed and just not working correctly?

Link to comment
34 minutes ago, Necrotic said:

Was it completely gone from the settings or was it listed and just not working correctly?

Gone. Exclusions were blank, when I know that at one time I had excluded appdata and several other folders specifically for this reason: it used too much CPU too frequently.

 

Once I entered the exclusions again, things returned to normal.
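
One quick way to confirm the exclusions actually reached the running daemon is to look at its command line; as in the ps output earlier in the thread, each exclusion shows up as an -e flag:

```
ps -eo args | grep "[c]ache_dirs"
# e.g. /bin/bash /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -e appdata -e system ... -l on
```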

Link to comment
9 hours ago, FrozenGamer said:

I removed appdata from the scanned directories and no longer have 100% CPU spikes. Wish I had known that a long time ago; hopefully my server will be more responsive without the spikes.

It's easier to just add what you need to index, but never use include and exclude together. But that's what they said from the beginning...
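
A hedged example of an include-only setup, assuming -i is the include counterpart to the -e excludes seen in the ps output above (share names are placeholders):

```
# Cache only the shares you actually browse; do not mix this with -e excludes
/usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -i Movies -i TV -l on
```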

Link to comment
  • 2 months later...

Since I updated to 6.6.7, or maybe even before, this doesn't seem to index my files anymore. I installed the latest from CA and even installed the beta posted several posts above. When I launch the script I spin up all my drives and literally watch them. They never seem to get read, and once they do spin down, checking a share (in my case TV or Movies, which I have set as my Includes) spins up all the drives that should have been indexed.

 

Yes, I've even read the help and changed the two settings that say to read them if your drives spin up.

Link to comment
On 3/6/2019 at 8:50 AM, kizer said:

Since I updated to 6.6.7, or maybe even before, this doesn't seem to index my files anymore. I installed the latest from CA and even installed the beta posted several posts above. When I launch the script I spin up all my drives and literally watch them. They never seem to get read, and once they do spin down, checking a share (in my case TV or Movies, which I have set as my Includes) spins up all the drives that should have been indexed.

 

Yes, I've even read the help and changed the two settings that say to read them if your drives spin up.

Hmm, I think I've recently noticed that too on v6.6.6. I upgraded to v6.6.6 quite some time ago but was only checking my cache settings now. I noticed my Stats -> Memory -> Cached is very low, 806.3 MiB. Usually the cached usage is around 6 GB, which seems like really high usage just for cache. My plugin status was Running, using plugin version 2018.11.18. I've disabled and re-enabled the cache, and after a few minutes it went up to 806.4 MiB. So maybe it's kind of working, but not fully?
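
For anyone wanting to cross-check the GUI's Cached figure from the shell, the same numbers are visible with:

```
free -m                         # 'buff/cache' column, in MiB
grep '^Cached:' /proc/meminfo   # page cache in kB
```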

Link to comment
  • 3 weeks later...

Hi everyone,

 

I've been using cache_dirs for a while, and I noticed my disks always spin up when accessing root folders, which cache_dirs should prevent. I went to investigate and it seems that it might not be doing anything.

I restarted it, left it running overnight, and got this:

 

[Screenshot: 2019-03-29 22:51:53]

 

I checked during the day and it was the same as now: no memory increase on the process, it seems to be stopped all the time, with some sporadic 0.7% CPU spikes. And the uptime seems to indicate the process restarted.

 

And the cache_dirs process is using the exact same amount of memory as when I started it... strange. Also, the uptime indicates that it might have crashed.

Here are my settings:

 

[Screenshot: cache_dirs settings, 2019-03-29 00:49:38]

I also tried cache pressure at its default (10), with no change in behaviour.

 

I also noticed that no logs were created in the respective folder:

 

[Screenshot: log folder, 2019-03-29 00:44:31]

 

My server has 32 GB of RAM and 16 TB of storage, running unRAID 6.7.0-rc6 and the latest version of cache_dirs.

This all seems so fishy. Am I doing something wrong? I tried the default settings, and already tried excluding some recommended directories, with no success.

What to do now?

 

Thanks.

Link to comment
12 hours ago, Alex R. Berg said:

I have no idea. If you run the cache_dirs script manually, maybe it's easier for you to figure out what is going on. There is also a switch to disable the background daemon, which may also help in your debugging. You have the command in the top output you posted, or at least the first half of it.

 

Best Alex

Hi,

 

I can't seem to find the switch to disable the background daemon anywhere. Where is it specifically?

Also, how can I run the script manually? Coding is not something I'm into, sorry.

 

I also noticed that the command spawns some child commands that last for less than a second and then go away. Is this normal, or is cache_dirs crashing?

 

See the moment they appear:

 

[Screenshot: 2019-04-01 23:28:21]

 

Another thing: I tried downgrading unRAID to the previous version (6.7.0-rc5) and I still see the same behaviour.

 

Thanks.

 

Link to comment
On 4/1/2019 at 11:33 PM, l3gion said:

Hi,

 

I can't seem to find the switch to disable the background daemon anywhere. Where is it specifically?

Also, how can I run the script manually? Coding is not something I'm into, sorry.

 

I also noticed that the command spawns some child commands that last for less than a second and then go away. Is this normal, or is cache_dirs crashing?

 

See the moment they appear:

 

[Screenshot: 2019-04-01 23:28:21]

 

Another thing: I tried downgrading unRAID to the previous version (6.7.0-rc5) and I still see the same behaviour.

 

Thanks.

 

I'm sorry, I don't have the time to go into a detailed problem-solving session. It's probably uphill for you if you're not familiar with scripts, but here are some hints.

 

cache_dirs should be on the PATH; otherwise use `/usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs`.

 

```
sudo /usr/local/emhttp/plugins/dynamix.cache.dirs/scripts/cache_dirs -h
```

shows you the available options and how to run it.

 

It spawns find subprocesses, which do the actual reading of dirs, so yeah, it indicates cache_dirs is not your problem, though of course maybe it caches too little when you see those folders spin up disks. Maybe you have too little RAM, but that's just guesswork.
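
One way to watch for those find subprocesses and see whether a scan is touching the array (the /mnt pattern is an assumption based on what cache_dirs scans):

```
watch -n 1 'ps -eo pid,etime,args | grep "[f]ind /mnt"'
```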

 

cache_dirs only caches /mnt/disk* and /mnt/cache, not other root folders, but those other root folders should be mounted in memory on unRAID anyway, unless you've done some manual mounts, which I doubt you have.

 

Best Alex

Link to comment
  • 2 weeks later...
  • 4 weeks later...

VM performance issues with cache_dirs enabled. Hi everyone, I'm just looking for a few hints about where to look within the Folder Caching options to identify what is causing some fairly serious performance issues. I suspected this was actually a networking issue at first, but it turns out that when I turn off the Folder Caching plugin my performance issues go away.

 

I have a 32-thread Threadripper, 128 GB RAM, and various HDD/SSD/NVMe drives. The Windows gaming VM on the NVMe works flawlessly with this plugin off, but when I turn it on the screen freezes while playing, sometimes for 3-5 seconds, by which time I've been killed, lol.

 

It's taken a long time to figure this out, because it really wasn't obvious. I have a CrashPlan backup that runs permanently in the background, which is why I need to have almost all folders available to this plugin; otherwise CrashPlan spins up the disks all the time to check for changed files. I could of course change that to daily or something, but really that's not a great solution.

 

See the screenshot below for my current settings. (I'm running unRAID 6.7.0-rc8.)

 

Do I need to increase the shell memory for a large number of files? I have about 48 TB of storage.

 

Thanks.

 

[Screenshot: Folder Caching settings, 2019-05-11 09:35:15]

Link to comment

Wow, nice machine :) It's low-level stuff in Linux that does the reading of the directory structure; cache_dirs itself just calls many find processes. I think I mentioned in previous messages how many files I have and what memory I use, so if you skim above you might find something. I can't think of anything helpful for you except to experiment. You can check whether cache_dirs is currently scanning the disks and look at the cache_dirs debug flags; the script contains some statements that might help you debug, if you are not already an expert at Linux.

 

Link to comment
