Caching, Cache Pools - best-practice setup?


boomam


Hi,

Today I merged my unassigned SSD/data back into the main pool, and now have 2x 1TB drives in an SSD cache pool for a 3x 4TB HDD array.

What I'm trying to work out is the best cache settings to use, whilst also minimizing HDD spin-ups and SSD write usage (this obviously depends on the apps at play).

 

In an ideal world, other than when I'm pulling data (reads), everything will go to the SSDs, and the HDDs will only spin up when the mover kicks in.

 

At the moment, I have ALL my shares, other than system & appdata, set to 'yes' for caching, with system & appdata set to 'prefer'.

With 25-30 containers active, including ones collecting logs (Influx/Telegraf, for example), and the syslog looping back to write to a log share, I'm seeing a few hundred KB/s of writes at any given moment, with the occasional spike up to a few MB/s.

 

However, I'm also seeing the HDDs spin up often.

If I manually spin down the HDDs, they stay off for a little while, but do end up spinning up again.

I am guessing that perhaps some data is being pulled by one of the containers that's not in the cache yet - perhaps it'll get better the longer it's in this configuration?

 

Some questions:

1. What is the best practice for cache usage?

2. Is there a way to have the cache somewhat mirror the data array (to a point), with the mover literally just doing a copy, to ensure parity/protection from the HDDs?

3. Is there a better way to see what files are writing at any given moment to either the array or the cache?

Some of the plugins, such as 'Open Files', are next to useless for diagnosing this, because the paths they list are the abstracted user-share paths instead of the actual ones - i.e. they show /user/XYZ rather than telling you whether it's the cache or a disk being written to or read from.
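
The closest I've got from the CLI - and this assumes inotify-tools is installed (e.g. via the NerdPack plugin; it isn't part of stock Unraid) - is watching the real mount points instead of /mnt/user, which at least distinguishes cache from disk:

# Watch the real mount points (not /mnt/user) so cache vs. disk activity is visible.
# inotify-tools is an assumption here - it is not part of stock Unraid.
inotifywait -m -r -e modify,create,delete /mnt/cache /mnt/disk1 /mnt/disk2 /mnt/disk3

Watching whole disks recursively can be slow to start on large trees, so it's only a rough diagnostic.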

 

Thanks!

6 minutes ago, boomam said:

I am guessing that perhaps some data is being pulled by one of the containers that's not in the cache yet

This suggests some confusion about how cache works in Unraid. Data is written to cache for cache-yes and cache-prefer shares. Cache-yes shares are moved from cache to array when mover runs, and cache-prefer shares are moved from array to cache when mover runs. Other than that, there is no "pulling" of data "that's not in the cache yet". If the data is already on the array, it won't ever go to cache unless it is in a cache-prefer share and mover moves it there.


Ok.

So what is the actual real-world advantage of cache-prefer then?

The docs say it copies data to the cache from the array (unless there's no room), and I assume reads/writes then happen from the cache - at that point, what is the mover even doing if it's not writing back to the array?

 

Comparatively, cache-yes seems to be: writes go to cache, then move to the array, with reads coming from the HDDs.

 

So there's truly no way to stop the HDDs spinning up at every little thing? That surely can't be right?


Reading around some more - if I want the VMs/containers to be high speed, they should be cache-only, with some backup routines running on them to move data to the array - correct?

 

Equally, I'm going down this route for each setting:

 

Cache Only

For items I want read/write cached, with other backup routines running.

  • Appdata
  • Domains

 

Cache Prefer

For items that I want to read from the cache, with writes flushed to the array by the mover.

  • System

 

Cache Yes

For items that I want written to the cache, with writes flushed to the array by the mover, but reads cached until the data has been moved.

  • Downloads
  • General Storage
  • Logs

 

Cache No

For items that I want to read & write from the array only.

  • Backups

 

 

HDD spin-up concerns aside, does the 'plan' above seem like good practice?

19 minutes ago, boomam said:

if I want the VMs/containers to be high speed, they should be cache-only, with some backup routines running on them to move data to the array - correct?

The CA Backup plugin, as mentioned, will archive these to the array.

 

31 minutes ago, boomam said:

what is the actual real-world advantage of cache-prefer then

Cache-prefer can overflow to the array if cache runs out of space. Best if you just don't allow that to happen though. Might make some sense for domains or appdata, but system really needs to stay on cache and it probably can't be moved anyway.

13 minutes ago, boomam said:

Cache Prefer

For items that I want to read from the cache, with writes flushed to the array by the mover.

  • System

As mentioned, this should stay on cache, and cache-prefer will try to keep it there, but cache-only might be better. It is unlikely mover would be able to move these anyway, since this share contains your docker and libvirt images and mover can't move open files.

 

You might do Compute All on the User Shares page to make sure appdata, domains, and system are indeed all on cache where they need to be.
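
If you prefer the command line, a rough equivalent check - share names assumed, using the standard /mnt/cache and /mnt/diskN mounts - is to see whether anything for those shares is still sitting on the array disks:

# Anything reported under /mnt/disk* for these shares is still on the array.
du -sh /mnt/cache/{appdata,domains,system} /mnt/disk*/{appdata,domains,system} 2>/dev/null

Empty output for the /mnt/disk* paths means those shares live entirely on cache.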

 

 

22 minutes ago, trurl said:

You might do Compute All on the User Shares page to make sure appdata, domains, and system are indeed all on cache where they need to be.

 

 

Good shout on that button - an easy way to check.

I now have system, appdata & domains set to be cache-only.

 

I'm reasonably happy with the layout mentioned above; my only issue now is that I can't get my HDDs to stay spun down. The aforementioned tools I got from CA are next to useless for diagnosing why the drives immediately spin up after spinning down.

 

Shutting down all containers & VMs still shows the drives coming straight back online... hmm...

 

Any ideas? 


Disabling the syslog server seems to have kept the drives spun down, even though the associated share was set to cache-yes... hmm...

 

- Edit - 

Odd - spoke too soon. I restarted the Docker service and all of a sudden I can no longer spin down the disks.

7 minutes ago, trurl said:

If the logs were already on the array then it will update those.

That's unusual behaviour - perhaps a reason why I'm seeing spin-ups of the disks.

Is there a reliable way to diagnose/tidy up?

 

For example, I just saw by chance that domains still existed on disk1 for some reason, and used the CLI to move it to cache.

1 hour ago, boomam said:

That's unusual behaviour - perhaps a reason why I'm seeing spin-ups of the disks.

Is there a reliable way to diagnose/tidy up?

 

For example, I just saw by chance that domains still existed on disk1 for some reason, and used the CLI to move it to cache.

Not unusual. Syslog is a single file, at least until it rotates. If that file already exists on a disk, then updates go to the existing file on that disk rather than to cache.
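
You can check which copy of the file is actually being updated by looking at the real paths - the 'logs' share name and file name here are assumed, so adjust to yours:

# Timestamps show which copy (cache or array disk) is being written to.
ls -l /mnt/cache/logs/syslog* /mnt/disk*/logs/syslog* 2>/dev/null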

 

As for domains, if it had files on the array, and those files were in use, then mover couldn't move them. On the other hand, if you set that share to cache-only when it already had files on the array, then mover wouldn't try to move them, since it only moves cache-prefer or cache-yes shares.
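
If you do move files yourself, a rough sketch - share name assumed, purely as an illustration - is below. Stop the Docker/VM services first so nothing holds the files open, work only with the real disk paths, and never mix /mnt/user paths with /mnt/disk or /mnt/cache paths in the same command:

# Copy the share's files from disk1 to the cache pool, preserving attributes.
rsync -avh /mnt/disk1/domains/ /mnt/cache/domains/
# After verifying the copy, remove the originals from the array disk:
rm -rf /mnt/disk1/domains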

 

See this FAQ for a more complete understanding of the use cache settings:

 

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/#comment-537383

 

 

Link to comment

Thanks - I've read it a few times, but the FAQ is typical Linux style - not great, but the accepted norm. ;-)

Once I get some time in a few weeks, I want to work out how to contribute to make it clearer/dumbed down. ;-)

 

I'm going through containers one by one, and I'm able to duplicate drive spin-ups, in addition to the logs one, with both Calibre & NextCloud, even though both their shares are cache-yes.

I'm a few away from having tested all my 30-odd containers, then I can loop back and diagnose.

2 minutes ago, boomam said:

I'm going through containers one by one, and I'm able to duplicate drive spin-ups, in addition to the logs one, with both Calibre & NextCloud, even though both their shares are cache-yes.

A cache-yes share will have most of its contents on the array and only new writes in cache until they get moved to the array.
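
If you want to see what is actually holding files open on the array disks - bypassing the user-share abstraction that makes 'Open Files' unhelpful - something like this against the real mount points is a starting point:

# List open files under each array disk's real mount point.
# +D recurses the whole tree, so this can be slow on large disks.
lsof +D /mnt/disk1
lsof +D /mnt/disk2
lsof +D /mnt/disk3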


The thing that's throwing me off right now is the drive spin-ups with NextCloud and Calibre.

When I had my docker.img file on an unassigned drive, I didn't have spin-up issues.

Also, NextCloud/Calibre aren't in constant use either, so I'm a little stumped as to what's causing the spin-ups.


Your diagnostics have 2 files in the shares folder anonymized as l--s.

 

Due to DOS/Windows naming conventions, where upper/lower case isn't significant, one of these has (1) appended so they won't have the same filename.

 

One of these has settings for a share that doesn't actually exist. The other share does exist but doesn't have any settings.

 

This is often caused by a user accidentally creating a share by specifying the wrong upper/lower case in a Docker mapping or other path. Linux is case-sensitive. Any folder at the top level of cache or the array is automatically a user share, even if you didn't create it as a user share in the webUI.

 

I don't know what that share is named since it is anonymized, but you need to clean that up, since the cfg file that exists in config/shares on your flash drive doesn't actually correspond to the share due to the upper/lower case problem, and the actual share itself has no corresponding .cfg file, which means it has default settings. The default Use cache setting is No.
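
A quick way to spot this kind of mismatch from the command line - these are the standard Unraid mounts, so only the share names differ per system:

# Every top-level folder on cache or an array disk is a user share:
ls /mnt/cache/ /mnt/disk*/
# Each share configured in the webUI has a matching .cfg here:
ls /boot/config/shares/

Any folder in the first listing with no matching (case-exact) .cfg in the second is running on default settings.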

Link to comment

OK,

How do I clean up missing shares? Is there a guide somewhere?

The directories do not exist on the file system, so where exactly are they located, in the context of Unraid?

 

-- EDIT --

Looking in /boot/config/shares, I can see 2 issues:

1. It lists a bitwarden share that doesn't exist.

2. It lists a log share with an 'L', whereas the actual share uses an 'l'.

 

Is this what I'm tidying up in some way?

Am I just deleting one and editing the other?

 

-- EDIT 2 --

The bitwarden one removes without issue.

The Logs one will not rename, as it thinks it's the same name - odd for Linux.


The bitwarden cfg file will be left over from when the share did exist. You can delete it or not - it isn't used, since the share doesn't exist. I prefer not to have these cluttering up the diagnostics.

 

I assume the log share you refer to is the one anonymized as l--s that I mentioned. The 'L' cfg file is settings for a share that doesn't exist, as I mentioned. There is no corresponding cfg file for the actual share that begins with 'l', so that share has default settings.

 

Renaming that 'L' cfg file should work, if you rename it in some OS that respects upper/lower case. Do you use Krusader or Midnight Commander on your server? Those would let you do the rename directly on your server with it still running, then you would have to stop and restart the array to get the shares restarted with those settings.

 

Another possibility is to just delete that 'L' cfg file and make new settings for the 'l' share.
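
If you'd rather stay on the CLI: the flash drive is FAT-formatted, so upper/lower case are treated as the same name there, which is why a direct rename fails. Renaming through a temporary name works around it (file names assumed for illustration, since your share name was anonymized):

cd /boot/config/shares
# FAT treats 'Logs.cfg' and 'logs.cfg' as the same file, so go via a temp name.
mv Logs.cfg logs.cfg.tmp
mv logs.cfg.tmp logs.cfg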

 


I use SSH to move things around - I'm more comfortable with the CLI.

I'll nuke the share in the GUI, then via the CLI, then re-create it.

 

-- EDIT --

Share deleted and re-created - /boot/config/shares now matches up correctly, including names.

 

Do you have any theories as to why Calibre/the books share is causing drive spin-ups?

 

Also, I'm a little concerned about NextCloud after the mover has run... I'm wondering if it's worth setting it up again and trying to split the /data folder out.

1 minute ago, boomam said:

Do you have any theories as to why Calibre/the books share is causing drive spin-ups?

I don't use that container. Does it have mappings to any user shares with any contents on the array? Are you sure it is the reason for the spin-ups?

 

4 minutes ago, boomam said:

I'm a little concerned about NextCloud after the mover has run... I'm wondering if it's worth setting it up again and trying to split the /data folder out.

I have NextCloud; my /data is mapped to a cache-yes user share. How are you doing it?

1 minute ago, trurl said:

I don't use that container. Does it have mappings to any user shares with any contents on the array? Are you sure it is the reason for the spin-ups?

 

I have NextCloud; my /data is mapped to a cache-yes user share. How are you doing it?

RE: Calibre

Both its library and import Docker mappings point to a 'books' share, which is set to cache-yes.

 

Re: NextCloud

The container path for /data goes to a share called 'NextCloud', which is set to cache-yes.

 

 

Thank you for your input thus far, btw - appreciate the effort.

2 minutes ago, boomam said:

RE: Calibre

Both its library and import Docker mappings point to a 'books' share, which is set to cache-yes.

 

Re: NextCloud

The container path for /data goes to a share called 'NextCloud', which is set to cache-yes.

 

 

Thank you for your input thus far, btw - appreciate the effort.

20 hours ago, trurl said:

A cache-yes share will have most of its contents on the array and only new writes in cache until they get moved to the array.

 

 
