cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up



some interesting ideas...  I'll give them some thought.

 

Does this comment also apply to my note re: suspending Cache_Dirs during parity checks?  :)

 

It sure seems that would be a VERY useful/nice feature.    Seems like a simple "If parity check in progress, don't start the next Find" check in Cache_Dirs  would basically suspend it => at least after the current Find completed.    No reason to shut down Cache_Dirs ... it would simply keep checking at the current intervals, but as long as the parity check was in progress, wouldn't initiate any more Finds ... thus not interfering with the check.

 

... and of course once the parity check was over, the next time Cache_Dirs checked all the disks would still be spinning, so it'd be very quickly up-to-date.

It does this today (puts itself to sleep) when the "mover" runs, so adding the logic for a parity check/disk rebuild is fairly easy.
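A minimal sketch of what that check might look like, assuming (as Joe describes later in the thread) that reading /proc/mdcmd yields an mdResync=N status line that is non-zero while a parity sync/check or rebuild is running. The function names and paths here are illustrative, not cache_dirs' actual code:

```shell
#!/bin/bash
# Hypothetical sketch -- not the actual cache_dirs code.  Assumes that
# reading /proc/mdcmd yields an "mdResync=N" status line (as quoted
# later in this thread) that is non-zero during a parity sync/check.

# Extract the mdResync value from a block of status text.
get_md_resync() {
    echo "$1" | grep '^mdResync=' | cut -d= -f2
}

# Return 0 (true) if the status text indicates a sync in progress.
parity_check_running() {
    local v
    v=$(get_md_resync "$1")
    [ -n "$v" ] && [ "$v" -ne 0 ]
}

# In the main loop: skip this interval's find instead of shutting down.
status=$(cat /proc/mdcmd 2>/dev/null || true)
if parity_check_running "$status"; then
    :   # parity check in progress -- don't start the next find
else
    find /mnt/disk*/ -noleaf >/dev/null 2>&1 || true
fi
```

Once the check completes, the next pass through the loop simply finds mdResync back at 0 and resumes the finds.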

 

 



I thought it might be.  Does that mean you're going to add it and update the Cache_Dirs download?  :)



Not today, I've got a very busy afternoon/evening. 


 

Certainly didn't mean to imply "now" ==> just sometime in the relatively near future ... anytime in the next few weeks is fine by me.    It'd be nice if you'd post a note when it's done.

I'm sure I'm not the only one who'd appreciate that modification.

 


Does this comment also apply to my note re: suspending Cache_Dirs during parity checks?  :)

 

It sure seems that would be a VERY useful/nice feature.    Seems like a simple "If parity check in progress, don't start the next Find" check in Cache_Dirs  would basically suspend it => at least after the current Find completed.    No reason to shut down Cache_Dirs ... it would simply keep checking at the current intervals, but as long as the parity check was in progress, wouldn't initiate any more Finds ... thus not interfering with the check.

 

Gary, you seem very certain that this would be useful so I assume you have either thought it through or have seen tests that showed an advantage to disabling CacheDirs?  I've tried thinking it through (but no tests done), and I can't see any advantage at all, and in fact it looks to me as if it could actually slow the parity check down a tiny bit.  Can you explain your logic, help me see what I'm missing here?

 

As far as I can see, a parity check and CacheDirs are 2 independent operations.  The parity check heavily uses the physical I/O busses and a little CPU time, but does not use the file system, and I don't think it even uses any memory caching (could be wrong though).  A properly set up CacheDirs does not use any physical I/O, does use some CPU, and does heavily access the buffered file system entries (the dentry cache as was stated above), in order to keep marking each dir entry as too important to drop.  Both need CPU time, but I doubt if either needs enough to impact the other.  An enabled CacheDirs should keep the parity check humming along without interruption.  Disabling CacheDirs would disable that protection, and allow other file system accesses to force pausing of the parity check to move the heads elsewhere to service needs no longer cached.  My SageTV polls the video folders every 5 minutes, and without CacheDirs running would require physical access to those folders.

 

... and of course once the parity check was over, the next time Cache_Dirs checked all the disks would still be spinning, so it'd be very quickly up-to-date.

 

A small point, but as to drives still spinning, that only applies to those users whose drives are all the same size.  Most of us have many sizes, and many of our smaller disks have spun down well before the end of the check.  In my array, 8 of the 10 drives spin down before the end.  They would all have to spin up on re-enabling CacheDirs, but that's not a big deal.


I think each system and usage is different. For some systems, you may want to disable cache_dirs, and others benefit from leaving it on. It all depends on how many disks, how many files and how much ram you have.

 

Robj's point has merit.

The only benefit of pausing cache_dirs on a parity check is if you see it affecting your parity check because it has to read the disks.  Good tuning and/or disk selection may change that.


... in fact it looks to me as if it could actually slow the parity check down a tiny bit.

 

Running Cache_Dirs slows down parity checking ... disabling it does not. (see next comment)

 

 

As far as I can see, a parity check and CacheDirs are 2 independent operations.

 

The parity check heavily uses the physical I/O busses and a little CPU time

 

A properly set up CacheDirs does not use any physical I/O

 

Indeed, they are two independent operations.  But Cache_Dirs definitely DOES do some physical I/O ... whenever the buffers containing the directories are overwritten it has to re-read the directory info.

 

Parity checking is obviously VERY I/O intensive ... reading ALL of the disks as quickly as it can and confirming that the parity is correct.  Think of this process ... every disk is read IN ORDER, so there's virtually NO head movement except for single-cylinder seeks as the disk is traversed.    A LOT of data is being buffered, so the data buffers that are holding the directory info for Cache_Dirs are clearly going to be overwritten ... and when Cache_Dirs does its next check, it's going to re-read the directory entries to try and keep the buffered directory info up-to-date.  THOSE reads are going to require some seeks -- which has two impacts:  (1) the disks get thrashed a bit by the extra seek operations; and (2) the time for these reads is added to the parity check time, since no parity checking can be done until it can continue reading data for the check.    Note that, while quick, seeks are nevertheless the "long pole in the tent" in terms of disk operations ... i.e. they're VERY LONG compared to all the other things that are going on [10-15ms sounds quick, but when you do it enough thousands of times it adds up].
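As a rough back-of-envelope (illustrative numbers only, not measurements from any particular array), even a modest number of extra seeks adds whole seconds:

```shell
# Illustrative arithmetic only: 5,000 extra seeks at ~12 ms each
# during a parity check.
seeks=5000
ms_per_seek=12
extra_seconds=$(( seeks * ms_per_seek / 1000 ))
echo "${extra_seconds} seconds added by seeks"   # 60 seconds
```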

 

The same thing is true for ANY ongoing accesses during a parity check.  Streaming a movie;  writing a bunch of new data;  copying files from the array; etc. all cause significant disk thrashing that will notably slow down the parity check time.    NOT a particularly large amount of time in most cases, but nevertheless it's unnecessary disk thrashing ... which I like to avoid.  [I NEVER watch a movie during a parity check ... or do anything else on the array either]

 

By the way, think about the process:  IF you were right and there was NO physical I/O required by Cache_Dirs, then the impact of a check that says "If Parity Check in progress, do not start a Find operation" would be ZERO => since no Find would be required anyway.  So all this check does is GUARANTEE that no physical disk I/O will be initiated by Cache_Dirs until the parity check has completed  :)

 

 

 

A small point, but as to drives still spinning, that only applies to those users whose drives are all the same size.  Most of us have many sizes, and many of our smaller disks have spun down well before the end of the check.  In my array, 8 of the 10 drives spin down before the end.  They would all have to spin up on re-enabling CacheDirs, but that's not a big deal.

 

You're absolutely right -- I had a "senior moment" when I wrote that  :)    My newest system has all 3TB WD Reds;  but my older media server has a mix of 1, 1.5, and 2TB drives and the 1 & 1.5's are indeed spun down by the end of a parity check ... so the first time Cache_Dirs "asks" if a parity check is in progress and the answer's "No", so it starts another Find, those drives will likely have to spin up.    Not a big deal ... but definitely a wrong statement on my part !!

 


Your explanation helped me see that I've made a couple of assumptions which may or may not be true.  One is that the caching of directory entries is completely separate from the general data buffers.  I had thought that the existence of the dentry cache implied that they were managed separately, and no amount of disk thrashing (apart from directory loading) would ever flush them.  This may or may not be so, and I'll happily learn from the true kernel experts here.

 

The other assumption was based on my predilection as to the most efficient way to do a parity check operation.  If I had programmed it, based on the principles of earlier OS's, I would have assigned a single buffer per drive, plus a few work buffers, and simply reread into them while cycling through the blocks.  But of course, modern I/O systems are layered, and you can't get as close to the hardware now, so additional disk buffering may be happening at lower levels than my read requests.  But I still can't help thinking (more assuming here!) that the disk I/O from the parity check is all at a lower level than the file system buffering, and therefore is not flushing anything that is filesystem related.  I'll await correction from the more knowledgeable here.

 

At any rate, it seemed wise to run my own test, so I have started a parity check.  I disabled SimpleFeatures, and made sure the system is relatively clean, only powerdown and UnMENU and CacheDirs, and restarted the server.  I started the check, then immediately disconnected the network cable, to disallow all external disk access.  I'll monitor a tail on the monitor to know when it's done.  Then I'll disable CacheDirs and rerun again the same way.  Unless I've forgotten something, that should be a fair test.


Your explanation helped me see that I've made a couple of assumptions which may or may not be true.  One is that the caching of directory entries is completely separate from the general data buffers.  I had thought that the existence of the dentry cache implied that they were managed separately, and no amount of disk thrashing (apart from directory loading) would ever flush them.  This may or may not be so, and I'll happily learn from the true kernel experts here.

 

 

The buffers are in low memory. Even if you have a large amount of ram, you can still have pressure in low memory.

In addition there is vfs_cache_pressure. This has to be tuned to keep dentries in memory as long as possible.

 

 

Coupled with the fixed size of the dentry hash table, the tuned amount of md_num_stripes, md_write_limit, md_sync_window and the actual number of files found, each system has its own set of values before IO occurs to the disk.

 

 

I once tried rebooting the system and expanding the dentry table to 2x its value and 3x its value. I soon ran out of low memory on each boot.

 

vfs_cache_pressure
------------------


Controls the tendency of the kernel to reclaim the memory which is used for
caching of directory and inode objects.


At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a "fair" rate with respect to pagecache and
swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to prefer
to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.
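That tunable can be read and changed at runtime via /proc (a generic Linux sysctl, not unRAID-specific); writing requires root:

```shell
# Show the current value.
cat /proc/sys/vm/vfs_cache_pressure

# Lower it so dentry/inode caches are retained more aggressively.
# (0 means "never reclaim" -- the kernel doc above warns that can
# lead to out-of-memory conditions.)  Writing requires root:
if [ "$(id -u)" -eq 0 ]; then
    echo 10 > /proc/sys/vm/vfs_cache_pressure
fi
```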

 

 

 

 

 

 


Clearly the amount of memory in the system;  the settings of the various tunable disk parameters; the number of directory entries that need to be cached; any other activity that may be going on; etc. can all have some impact.

 

But a check that simply says "Don't do any disk activity if a parity check is in progress" would certainly be at worst neutral (if no Finds were needed);  and at best would stop all Cache_Dirs related disk I/O during the parity check if Finds were necessary to update the directories.    All this will do is GUARANTEE that no physical disk I/O will be initiated by Cache_Dirs until the parity check has completed  :)

 

By the way, Joe indicated he's going to also do this for disk rebuilds ... which is certainly a good idea; as these also are other completely sequential I/O operations on the disks, so any unnecessary thrashing will slow them down as well.

 

 

 


 

At any rate, it seemed wise to run my own test, so I have started a parity check.  I disabled SimpleFeatures, and made sure the system is relatively clean, only powerdown and UnMENU and CacheDirs, and restarted the server.  I started the check, then immediately disconnected the network cable, to disallow all external disk access.  I'll monitor a tail on the monitor to know when it's done.  Then I'll disable CacheDirs and rerun again the same way.  Unless I've forgotten something, that should be a fair test.

 

I was going to do exactly that, but since you've already started it, I'll wait and see what your results are.  It can, of course, depend on your tunable disk parameters ... I have mine set to use significantly more buffers during parity checks than the default (which changed my parity check times from ~ 8:25 to 7:41).

 

I do not, by the way, anticipate a major difference in the times ... but I do think it will take longer with Cache_Dirs active; and don't see any reason to have the unnecessary disk thrashing going on when it can easily be avoided.

 


I'm not saying that quiescing cache_dirs is a bad idea, nor that preventing it from quiescing is a bad idea.

I think the option should exist for people who need it.

 

In one case, a user reports it slows down full array parity operations.

Another user reports it makes very little difference.

 

I can say from experience, cache_dirs caused problems on my array. Not because of a programming issue, just the sheer magnitude of files I had on my array.

 

So the point is, it would be helpful if it is programmed as an option for those who need it or choose to use it.


I am in the middle of making the necessary code changes, but am trying to work out how to determine if a parity check/sync/reconstruction is in progress.

 

The only way I know is to query /proc/mdcmd, but since on a properly tuned system, the "find loop" occurs every 10 seconds, or less, I need to code cache_dirs to only look for a "parity sync/calc/reconstruction" on a less frequent basis than every few seconds... probably every 5 minutes or so would do it, otherwise, it might impact the md driver if the status command was invoked too frequently.

 

Then, cache_dirs would suspend itself within the first 5 minutes of a parity sync/calc.
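A sketch of that rate-limited poll (the names and the 300-second interval are illustrative, not Joe's actual implementation; it assumes the mdResyncPos field from /proc/mdcmd, which the driver code quoted later in the thread sets to 0 when no sync is running):

```shell
#!/bin/bash
# Hypothetical sketch of the rate-limited status poll described above.
# Assumes /proc/mdcmd status contains "mdResyncPos=N", which is 0 when
# no sync is running.  Names and the 300s interval are illustrative.

CHECK_INTERVAL=300   # seconds between /proc/mdcmd polls
last_poll=0
suspended=0

maybe_poll_parity_status() {
    local now
    now=$(date +%s)
    # Only touch /proc/mdcmd once per CHECK_INTERVAL; the find loop
    # itself may wake every few seconds.
    if [ $(( now - last_poll )) -ge "$CHECK_INTERVAL" ]; then
        last_poll=$now
        if grep -q '^mdResyncPos=[1-9]' /proc/mdcmd 2>/dev/null; then
            suspended=1   # parity sync/check in progress
        else
            suspended=0
        fi
    fi
}

# The main loop would call maybe_poll_parity_status each iteration and
# run the next find only when suspended=0.
```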

 

Ideas?  Feedback?

 

Joe L.


Five minutes sounds okay to me.  If the Find loop runs every 10 seconds, that's about 30 iterations in 5 minutes ... so even if every iteration resulted in a 50-100ms "thrash & cache" [directory read] that would only be an extra couple seconds for the parity check.

 

Sounds like the process is a bit more complex than I envisioned ... I thought it could simply be a fixed test for parity check/disk rebuild with two exits:  (a) do nothing; or (b) do the next Find.  Sounds like you need a completely different loop that tests for an in-process parity check/rebuild and sets or clears a "suspend" flag for Cache_Dirs.

 

If it's easier, I'd be quite happy with a simple button in UnMenu (perhaps on the User Scripts page) that would suspend Cache_Dirs  :)    [With, of course, a corresponding one to "un-suspend it" ]    Wouldn't be as nice as an automatic suspension, but simple enough to just do it before running a parity check.  [Which also makes me wonder why you've never added Cache_Dirs to Unmenu ??]

 


I thought there was a value from the /proc/mdcmd that had a time value in seconds.

 

mdResync=3907018532

mdResyncCorr=1

mdResyncPos=4265472

mdResyncDt=30

mdResyncDb=4265472

 


                /* time delta in seconds */
                dt = ((jiffies - mddev->resync_mark) / HZ);
                if (!dt) dt++;
                p += sprintf(p, "mdResyncDt=%lu\n", dt);
                
                /* resync'ed blocks delta */
                db = resync - (mddev->resync_mark_cnt/2);
                p += sprintf(p, "mdResyncDb=%llu\n", db);
            
            
                /* tmm: following code is omitted because it requires 64-bit division
                 * which requires use of do_div() and is a pain-in-the-neck to use,
                 * instead we output the two deltas above so that user space can do
                 * whatever cacluation it wants.
                 * 
                 * We do not want to overflow, so the order of operands and
                 * the * 100 / 100 trick are important. We do a +1 to be
                 * safe against division by zero. We only estimate anyway.
                 *
                 * dt: time from mark until now
                 * db: blocks written from mark until now
                 * rt: remaining time
                 */
//              unsigned long long max_blocks = mddev->recovery_running/2;
//              unsigned long long res;
//              unsigned long long rt;
                /* compute completion percentage */
//              res = resync / (max_blocks/1000 + 1);
//              p += sprintf(p, "mdResyncPrcnt=%llu.%llu\n", res/10, res % 10);
                /* sync rate in blocks/sec */
//              p += sprintf(p, "mdResyncSpeed=%llu\n", db/dt);
                /* compute remaining time in minutes */
//              rt = (dt * ((max_blocks-resync) / (db/100+1)))/100;
//              p += sprintf(p, "mdResyncFinish=%llu.%llu\n", rt / 60, (rt % 60)/6);
        }
        else {
                p += sprintf(p, "mdResyncPos=0\n");
                p += sprintf(p, "mdResyncDt=0\n");
                p += sprintf(p, "mdResyncDb=0\n");
        }

 

 

Once you determine a lengthy process is executing, change the frequency of the loop, Expand it by minutes, tens of minutes, etc, etc.
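Per the driver comment above, user space is expected to do the division itself. A small sketch using the sample /proc/mdcmd values quoted at the top of this post:

```shell
# Derive the sync rate and completion from the two deltas, using the
# sample /proc/mdcmd values quoted earlier in this post.
md_resync=3907018532   # total blocks
pos=4265472            # mdResyncPos: blocks completed so far
dt=30                  # mdResyncDt: seconds since the last mark
db=4265472             # mdResyncDb: blocks written since the last mark

speed=$(( db / dt ))   # blocks/sec (integer division, as shell does)
echo "sync speed: ${speed} blocks/sec"

# Completion percentage needs floating point, so use awk.
pct=$(awk -v p="$pos" -v t="$md_resync" 'BEGIN { printf "%.2f", p * 100 / t }')
echo "progress: ${pct}%"
```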


Initial result: parity check with CacheDirs running took 25247 seconds, parity check without CacheDirs took 25462 seconds.  The difference, about 0.85% longer, may not be statistically significant, but it does not indicate a CacheDirs impact.  The first test with CacheDirs running was immediately after a fresh boot, so it is possible that someone might claim a fresh system advantage that the second test did not have, but it is hard to imagine any advantage maintained over the long check process.  In any event, I'll be happy to repeat it to rule that out, if anyone wants.
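The arithmetic behind that comparison:

```shell
# Difference between the two runs as a percentage of the faster one.
with_cachedirs=25247      # seconds, CacheDirs running
without_cachedirs=25462   # seconds, CacheDirs stopped
awk -v a="$with_cachedirs" -v b="$without_cachedirs" \
    'BEGIN { printf "%.2f%% longer without CacheDirs\n", (b - a) * 100 / a }'
```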

 

I've started a new parity check (CacheDirs still off) with network cable connected, thereby enabling any polling by my SageTV on another machine.  This should be similar to those of us with media players/managers or other external process(es) that periodically poll folders on the server.  Then I'll run one more test with CacheDirs re-enabled, network connected.

 

My tunables are relatively standard, with 1GB of RAM:

md_num_stripes 1280

md_write_limit 768

md_sync_window 384

 

I believe my vfs_cache_pressure is 0, which is what I have preferred since I have never ever had an OutOfMemory condition, but I'm unsure of how to obtain the value of vfs_cache_pressure.  If someone can give me the command, I'll verify it.
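For what it's worth, the current value can be read with either of these (standard Linux commands, nothing unRAID-specific):

```shell
# If the sysctl utility is available:
command -v sysctl >/dev/null && sysctl vm.vfs_cache_pressure || true
# Always available via /proc:
cat /proc/sys/vm/vfs_cache_pressure
```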

 

Jun 22 16:06:33 JacoBack emhttp_event: svcs_restarted
Jun 22 16:07:00 JacoBack init: Re-reading inittab
Jun 22 16:09:00 JacoBack cache_dirs: command args=-w -m 3 -M 3 -a -noleaf -i Vid*, version=1.5r
Jun 22 16:09:00 JacoBack cache_dirs: max_seconds=3, min_seconds=3, max_depth=9999, command=find -noleaf
Jun 22 16:09:00 JacoBack cache_dirs: VidLib3,VidLib4,Videos,Videos1,Videos2,Videos3,Videos4,Videos5,Videos6,Videos7,Videos8,Videos9, root_dirs
Jun 22 16:09:01 JacoBack cache_dirs: cache_dirs process ID 1515 started, To terminate it, type: cache_dirs -q
Jun 22 16:11:09 JacoBack login[1522]: ROOT LOGIN  on '/dev/tty1'
Jun 22 16:11:40 JacoBack kernel: mdcmd (49): check CORRECT
Jun 22 16:11:40 JacoBack kernel: md: recovery thread woken up ...
Jun 22 16:11:40 JacoBack kernel: md: recovery thread checking parity...
Jun 22 16:11:40 JacoBack kernel: md: using 1536k window, over a total of 1953514552 blocks.
Jun 22 16:11:48 JacoBack kernel: skge 0000:01:04.0: eth0: Link is down
Jun 22 18:20:42 JacoBack kernel: mdcmd (50): spindown 5
Jun 22 18:20:43 JacoBack kernel: mdcmd (51): spindown 7
Jun 22 18:43:44 JacoBack kernel: mdcmd (52): spindown 6
Jun 22 19:41:28 JacoBack kernel: mdcmd (53): spindown 1
Jun 22 19:41:28 JacoBack kernel: mdcmd (54): spindown 3
Jun 22 20:14:30 JacoBack kernel: mdcmd (55): spindown 8
Jun 22 21:24:46 JacoBack kernel: mdcmd (56): spindown 2
Jun 22 21:24:47 JacoBack kernel: mdcmd (57): spindown 4
Jun 22 23:12:28 JacoBack kernel: md: sync done. time=25247sec
Jun 22 23:12:28 JacoBack kernel: md: recovery thread sync completion status: 0
Jun 22 23:43:42 JacoBack kernel: skge 0000:01:04.0: eth0: Link is up at 1000 Mbps, full duplex, flow control both
Jun 22 23:54:58 JacoBack emhttp: Spinning up all drives...
Jun 22 23:54:58 JacoBack kernel: mdcmd (58): spinup 0
Jun 22 23:54:58 JacoBack kernel: mdcmd (59): spinup 1
Jun 22 23:54:58 JacoBack kernel: mdcmd (60): spinup 2
Jun 22 23:54:58 JacoBack kernel: mdcmd (61): spinup 3
Jun 22 23:54:58 JacoBack kernel: mdcmd (62): spinup 4
Jun 22 23:54:58 JacoBack kernel: mdcmd (63): spinup 5
Jun 22 23:54:58 JacoBack kernel: mdcmd (64): spinup 6
Jun 22 23:54:58 JacoBack kernel: mdcmd (65): spinup 7
Jun 22 23:54:58 JacoBack kernel: mdcmd (66): spinup 8
Jun 22 23:54:58 JacoBack kernel: mdcmd (67): spinup 9
Jun 22 23:56:32 JacoBack cache_dirs: killing cache_dirs process 1515
Jun 22 23:57:56 JacoBack kernel: mdcmd (68): check CORRECT
Jun 22 23:57:56 JacoBack kernel: md: recovery thread woken up ...
Jun 22 23:57:56 JacoBack kernel: md: recovery thread checking parity...
Jun 22 23:57:56 JacoBack kernel: md: using 1536k window, over a total of 1953514552 blocks.
Jun 22 23:58:04 JacoBack kernel: skge 0000:01:04.0: eth0: Link is down
Jun 23 02:09:44 JacoBack kernel: mdcmd (69): spindown 5
Jun 23 02:09:44 JacoBack kernel: mdcmd (70): spindown 7
Jun 23 02:32:55 JacoBack kernel: mdcmd (71): spindown 6
Jun 23 03:30:39 JacoBack kernel: mdcmd (72): spindown 1
Jun 23 03:30:39 JacoBack kernel: mdcmd (73): spindown 3
Jun 23 04:03:41 JacoBack kernel: mdcmd (74): spindown 8
Jun 23 05:14:37 JacoBack kernel: mdcmd (75): spindown 2
Jun 23 05:14:38 JacoBack kernel: mdcmd (76): spindown 4
Jun 23 07:02:11 JacoBack kernel: md: sync done. time=25462sec
Jun 23 07:02:11 JacoBack kernel: md: recovery thread sync completion status: 0
Jun 23 08:02:19 JacoBack kernel: mdcmd (77): spindown 0
Jun 23 08:02:20 JacoBack kernel: mdcmd (78): spindown 9
Jun 23 09:25:22 JacoBack kernel: skge 0000:01:04.0: eth0: Link is up at 1000 Mbps, full duplex, flow control both


Very interesting ... and surprising ... results.    I didn't expect a big difference, but I DID expect that it would be better with Cache_Dirs NOT running.

 

It is, in fact, very hard to understand how Cache_Dirs could actually IMPROVE the time !!

 

One obvious factor is how many files you have cached.    Is Cache_Dirs set to cache ALL of your files?    ...  My v5 server has, for example, 270,128 files in 20,651 folders.

 

Your results are definitely intriguing, however => I'm going to start Cache_Dirs, give it a few minutes to populate the cache; and then fire up a parity check to see how it compares to the very-consistent 7:41 it's been taking.

 


Okay, I'll post back in ~ 8 hrs with the results.    Started Cache_Dirs;  gave it 20 minutes to ensure it had ample time to populate the cache;  cleared the statistics; did a Spinup all drives; and started a parity check at exactly 10:30 local time.

 

I've done a LOT of parity checks recently, as I "tuned" my parameters ... and it now runs almost exactly 7:41 => so in 8 hours I'll know for sure whether Cache_Dirs has any impact on that time.    Just to ensure there's no impact from excessive refreshes of the Web GUI, I don't plan to even look at it until 8 hours have passed  :)


Question for Joe ==>  I'm not going to try this while my parity check is running, but if I was to access the share via Windows Explorer to look at the directories, does that cause any activity that might change the impact of Cache_Dirs ??

 

I'd assume not ... i.e. I should be able to browse the folders with ZERO impact on the ongoing parity test and Cache_Dirs functionality (as long as I don't actually open any of the files) => but am interested in whether you agree.

 


Question for Joe ==>  I'm not going to try this while my parity check is running, but if I was to access the share via Windows Explorer to look at the directories, does that cause any activity that might change the impact of Cache_Dirs ??

 

I'd assume not ... i.e. I should be able to browse the folders with ZERO impact on the ongoing parity test and Cache_Dirs functionality (as long as I don't actually open any of the files) => but am interested in whether you agree.

It would depend on if your explorer is indexing the files, creating thumbnail images, or not.  In any case, using file-explorer to simply list the files is exactly the same as a "find" command in cache-dirs.  It will have a minimal effect on a parity check (barely noticeable, if at all)


 

Thanks.  I had assumed as much, but just wanted to confirm that I wasn't overlooking something.

 

If this parity check completes in the same 7:41 it normally takes, I'll be (pleasantly) surprised !!    Maybe Cache_Dirs doesn't need to suspend itself after all !!  8)  8)

 

It just SEEMS like all those buffered reads during a parity check would impact the cache buffers and result in a LOT of extra Finds that would thrash the disks a bunch and add a modest number of minutes to the process ... but Robj's results certainly didn't indicate that was the case.

 

 


It just SEEMS like all those buffered reads during a parity check would impact the cache buffers and result in a LOT of extra Finds that would thrash the disks a bunch and add a modest number of minutes to the process ... but Robj's results certainly didn't indicate that was the case.

 

There are a number of different buffers being used here.

 

The buffer cache buffers filesystem and file data.

The dentry cache buffers directory entry filesystem data.

The md/unraid buffers.

 

From what I know the dentry and md/unraid buffers are in low memory.

They do not normally impact one another.

If there is no cache pressure, then the dentry cache will exist until the dentry expires.

 

Normally you won't see an impact unless you have so many files (like I did) that they cannot all be stored in the dentry cache.  Then the dentry cache starts dropping entries to hold newer entries.

As this happens you start reading directories from the disk instead of the hash table. That's when it could impact parity checks.
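One way to watch for that situation is the dentry row of /proc/slabinfo, which reports active versus total dentry objects (an assumption here: /proc/slabinfo may be readable only by root on some kernels, hence the guard):

```shell
# Report active vs. total dentry objects from the slab allocator.
# (/proc/slabinfo may be root-only on some kernels, hence the check.)
if [ -r /proc/slabinfo ]; then
    grep '^dentry ' /proc/slabinfo | awk '{ print "dentries active/total:", $2 "/" $3 }'
else
    echo "/proc/slabinfo not readable (try as root)"
fi
```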



A couple questions ...

 

(a)  How many files do you have?    [ballpark]

 

(b)  How can I tell what the current cache pressure setting is?  I read somewhere that UnRAID defaults this to 60 ... is that no longer true?

 

