Reynald

Smart caching script


Posted (edited)

Hello all,

 

Background

I have an array of 8 mechanical HDDs (40TB) in an Unraid v6.8.2 DVB server.
I have 40GB of memory, of which only about 8GB is used. I don't use memory for caching, for now.

I have a 2TB SSD mounted as cache, hosting Docker appdata and VM domains. Until now I was using the Unraid cache system to store only new files from a data share, with a script moving them to the array when the SSD was 90% full. With this method, only the most recently written files were on the cache, so I rethought the whole thing, see below.

I use plex to stream to several devices on LAN (gigabit ethernet) or WAN (gigabit fiber internet), and also seed torrents with transmission.

 

Here is my share setup:

[Image: screenshot of the share setup]

 

So I wanted to dynamically cache files from the data share to the SSD. The main file consumers are Plex and Transmission, which have their data in a data share.

As a fail-safe, I set mover to only move files if cache usage is more than 95%.

I wrote a script that automagically handles caching of the data share, using the SSD up to 90% (including appdata and VMs).

 

What the script needs:

  • an RPC-enabled Transmission installation (optional)
  • access to Plex web API (optional)
  • path to a share on cache
  • path to same share on array

 

What the script does:

When you start it, it makes basic connection and path checks, then executes 3 main functions (cleaning runs twice):

  1. Cleans the selected share on cache to have at least 10% free (configurable). To free space, the oldest data is copied back to the array and then deleted from cache.
  2. Retrieves the list of active torrents from the transmission-rpc daemon and copies them to cache without removing them from the array.
    (note: active torrents are those downloading or seeding during the last minute, but also those starting and stopping; that's a caveat if you start/stop a batch of torrents and launch the script within that minute)
  3. Retrieves the list of active playing sessions from Plex and copies (rsync, same as mover or unbalance) movies to cache without removing them from the array.
    For series, there are options to copy:
    • the current and next episode
    • or all episodes from the current one to the end of the season
  4. Cleans again
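
The "oldest data first" selection in step 1 could look roughly like the sketch below. This is only an illustration, not the script's actual code: it assumes "oldest" means oldest modification time, it relies on GNU find (for -printf), and the function name list_oldest_first is made up for this example.

```shell
# Hypothetical helper: list regular files under a directory,
# oldest modification time first (assumes GNU find and no newlines in names).
list_oldest_first() {
    # %T@ prints the mtime as seconds since the epoch, %p the path.
    # sort -n orders by that timestamp, cut drops it again.
    find "$1" -type f -printf '%T@\t%p\n' | sort -n | cut -f2-
}
```

A cleanup pass would then walk this list from the top, copy each file back to the array, delete it from cache, and stop once enough space is free.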

Note:

  • to copy, rsync is used, like mover or unbalance, so it syncs data (it doesn't overwrite a file that already exists)
  • in addition, hard links, if any (from radarr, sonarr, etc.), are recreated on the destination (cache when caching, array when cleaning cache)
  • if you manually send a file to the share on cache, it will be cleaned when it gets old :) you may want to write a side script for that (for working files, libraries, etc.)
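
As a rough illustration of the copy step, here is a small wrapper around rsync with the -aH options (archive mode, preserve hard links) that are mentioned later in this thread. The function name sync_to_cache and the DRY_RUN switch are assumptions for this sketch, not part of the real script, and the paths in the usage line are only examples.

```shell
# Hypothetical wrapper: copy one file into a cache directory with rsync -aH.
# With DRY_RUN=1 the command is only printed, nothing is copied.
sync_to_cache() {
    src="$1"
    dest_dir="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "rsync -aH $src $dest_dir/"
    else
        # rsync skips files that already match in size and modification time.
        mkdir -p "$dest_dir" && rsync -aH "$src" "$dest_dir/"
    fi
}
```

For example, `DRY_RUN=1; sync_to_cache /mnt/user0/data/movie.mkv /mnt/cache/data` prints the rsync command without running it. Note that -H only preserves hard links between files that are part of the same transfer.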

 

Because of the shfs mechanism, accessing a file through /mnt/user reads/writes from cache if the file exists there, and otherwise from the array. Duplicate data is not a problem and speeds things up overall.
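
To make that lookup order concrete, the sketch below mimics it in plain shell. shfs does this internally; this is not how it is implemented, just a way to check which copy would be served. The function name and the CACHE_ROOT / ARRAY_ROOT variables are hypothetical, with defaults taken from the paths used in this thread.

```shell
# Hypothetical helper: print the physical path backing a user-share file,
# checking the cache first and the array second.
resolve_backing_path() {
    rel="$1"                                  # path relative to the share root, e.g. data/movie.mkv
    cache_root="${CACHE_ROOT:-/mnt/cache}"    # assumed default
    array_root="${ARRAY_ROOT:-/mnt/user0}"    # assumed default
    if [ -e "$cache_root/$rel" ]; then
        echo "$cache_root/$rel"
    elif [ -e "$array_root/$rel" ]; then
        echo "$array_root/$rel"
    else
        return 1                              # file exists in neither place
    fi
}
```

If a file has been duplicated to cache, the function prints the /mnt/cache path, which is the copy a reader going through /mnt/user would get.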

 

The script is very useful when, like me, you have noisy/slow mechanical HDDs for storage and a quick and quiet SSD to serve files.

 

Script installation:

I recommend copying and pasting it into a new script created with User Scripts.

 

Script configuration:

No parameters are passed to the script, so it's easy to use with the User Scripts plugin.

To configure it, edit the relevant section at the beginning of the script; the parameters are pretty much self-explanatory:

[Image: screenshot of the configuration section of the script]

Here is a log from execution:

[Image: screenshot of an execution log]

Pretty neat, huh?

 

Known previous issues (updates may come later to fix them):

  • At the moment, the log can become huge if, like me, you run the script every minute. This is the recommended interval because the transmission-rpc active torrent list only contains those from the last minute. 
    Edit 13-02-2020: corrected in latest version
  • At the moment, an orphan file (existing only on cache) being played or seeded is detected, but not synced to the array until it needs to be cleaned (e.g. fresh torrents, recent movies fetched by *arr and newsgroups, etc.). 
    Edit 13-02-2020: corrected in latest version: it syncs back to the array during the configured day (noisy) hours.
  • I don't know if/how shfs will handle the file on cache. I need more investigation/testing to see if it efficiently reads the file from cache instead of the array. I guess Transmission/Plex need to close and reopen the file to pick it up from the new location? (my assumption is that both read in chunks, so caching should work).
    Edit 13-02-2020: yes, after checking with the File Activity plugin, that's the case, and Plex/Transmission read the file from cache as soon as it is available!

 

Conclusion, disclaimer, and link:

The script has been running successfully in my configuration since yesterday. Before using rsync I was using rclone, which has a cache back-end, a similar Plex caching function, plus a remote (I used it for Transmission), but it's not as smooth and quick as rsync.

Please note that, even though I run it 1440 times a day (User Scripts, Custom schedule * * * * *), this script is still experimental and can:

  • erase (or most likely fill up) your SSD,
    Edit 13-02-2020: but I have not experienced this; error handling improved
  • erase (not likely) your array
    Edit 13-02-2020: Technically, this script never deletes anything on the array, so it won't happen
  • kill your cat (sorry)
  • make your mother in law move to your home and stay (I can do nothing for that)
  • break your server into pieces (you can keep these)

Thanks for reading to this point, you deserve to get the link to the script (<- link is here).

 

If you try it or have any comment, idea, recommendation, question, etc..., feel free to reply ;)

 

Take care,

Reynald

 

Edited by Reynald
Correct link


Hello all,

 

I updated the script 2 days ago, and it's holding up well!

 

I have very few spin-ups now, because the 1.4TB of most recent data is duplicated on the SSD.

 

Quote

- Added max number of torrent/sessions for transmission/plex, preventing caching all torrents at transmission startup/torrent checks
- Added syncing new files on cache to storage during day hours
- Improved logging
- Improved error handling for caching/uncaching function

 

It's on my github: https://bit.ly/Ro11u5-GH_smart-cache

 

Shall I make this a plugin?


Updated: v0/5/14:

- Improvement on verbosity (new settings)
- Added parameter CACHE_MAX_FREE_SPACE_PCT="85" in addition to CACHE_MIN_FREE_SPACE_PCT="90"
=> When cache usage exceeds CACHE_MIN_FREE_SPACE_PCT (here 90%), space is freed until usage drops to CACHE_MAX_FREE_SPACE_PCT (here 85%)
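
The two thresholds work as a hysteresis band: cleaning starts above 90% and keeps going until 85% is reached, so the script does not flip in and out of cleaning around a single limit. A minimal sketch of that decision, with a hypothetical function name and argument order (the real script may structure this differently):

```shell
# Hypothetical helper: succeed (exit 0) if more space should be freed.
# $1 = current cache usage in %, $2 = start threshold (e.g. 90),
# $3 = stop threshold (e.g. 85), $4 = 1 if a cleaning pass is already running.
needs_cleaning() {
    usage="$1"; start_pct="$2"; stop_pct="$3"; cleaning="$4"
    if [ "$cleaning" = "1" ]; then
        # Already cleaning: continue until usage is down to the stop threshold.
        [ "$usage" -gt "$stop_pct" ]
    else
        # Idle: only start once usage exceeds the start threshold.
        [ "$usage" -gt "$start_pct" ]
    fi
}
```

With the values above, a cache at 88% is left alone when idle, but a pass that started at 92% keeps freeing space through 88% and stops at 85%.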


Nice script!  Love the Plex section of it.

 

Question, why are you using rsync instead of the mover provided binary?

 

The binary just needs a file list.  To move from cache to disk, the file list needs to contain /mnt/cache/<SHARENAME>..., and for array to cache, /mnt/disk<##>/<SHARENAME>

 

Example call from the mover bash script.

  # Check for objects to move from array to cache
  for SHAREPATH in $(ls -dv /mnt/disk[0-9]*/*/) ; do
    SHARE=$(basename "$SHAREPATH")
    if grep -qs 'shareUseCache="prefer"' "/boot/config/shares/${SHARE}.cfg" ; then
      find "${SHAREPATH%/}" -depth | /usr/local/sbin/move -d $LOGLEVEL
    fi
  done
 

 

Posted (edited)

Hello,

Thank you for your interest and kind words @hugenbdd! This script took me quite a few hours of thinking/scripting.

 

2 hours ago, hugenbdd said:

Question, why are you using rsync instead of the mover provided binary?

I was not aware of the mover binary. 

If I'm not mistaken, the /usr/local/sbin/mover.old script, where you found the piece of script for your example, used to invoke rsync. I recall picking the rsync options (-aH) from this mover.old script.

 

My strategy is not to move, but to archive-sync in both directions (same as mover), and to delete from cache depending on disk usage, never deleting from the array.
Some benefits:

- A file is not overwritten if identical; the latest copy is on cache if it exists on cache.

--> Moving from cache to array and vice versa would take more time than duplicating data (mover does not really move either; it syncs and deletes).

--> Copying from array to cache leaves the data protected by parity

--> Having control over deletion lets me handle hard links (a torrent seeded by Transmission from cache is also available to Plex). Mover also preserves them since it moves a whole directory, but I'm moving files.

--> I can bypass the cache "prefer/yes/no/only" directives, and set mover so it won't touch my "data" share until I'm short on space on cache (e.g. if this smart-cache script is killed).

--> Using rsync with the -P parameter while debugging/testing gives some status/progress info 

Drawbacks: 

- data is duplicated

- deletions and modifications made on the array through /mnt/user0 or /mnt/diskN are not synced to /mnt/cache. This cannot happen if we use /mnt/user for the 'data' share.

 

But thanks to your suggestion (the file list idea), I have an idea about how to sync cache-only files (e.g. fresh Transmission downloads during quiet hours) to the array.
Also, mover may do some extra checks from array to cache. From cache to array, I rely on the Unraid shfs mechanism, as I sync to /mnt/user0 (and not to /mnt/diskN); the same goes for hard links, which are handled well by shfs.

 

If you want to use this script for plex only, you can:

- set $TRANSMISSION_ENABLED to false

or if you want to clean the script:

- remove the #Transmission parameters, the transmission_cache function and its call ('$TRANSMISSION_ENABLED && transmission_cache') at the bottom of the script.
I may extend it to other torrent clients later.
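
For anyone unfamiliar with the '$TRANSMISSION_ENABLED && transmission_cache' idiom: the variable holds the word true or false, which the shell runs as a command, so the function after && is only called when the value is true. A small stand-alone illustration follows; the function bodies are placeholders, and plex_cache and run_caching are hypothetical names used only for this example.

```shell
TRANSMISSION_ENABLED=true    # set to false to skip the Transmission part

# Placeholder bodies standing in for the real caching functions.
transmission_cache() { echo "caching active torrents"; }
plex_cache() { echo "caching Plex sessions"; }    # hypothetical name

run_caching() {
    # "true" and "false" are shell commands; && runs the function only on success.
    $TRANSMISSION_ENABLED && transmission_cache
    plex_cache
}

run_caching
```

With TRANSMISSION_ENABLED=false, only the Plex line is printed, which is why flipping the setting is enough and removing the code is optional.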

Edited by Reynald
