Smart caching script



Hello all,

 

Background

I have an array of 8 mechanical HDDs (40TB) in an Unraid server v6.8.2 DVB.
I have 40GB memory, only about 8GB are used. I don't use memory for caching, for now.

I have a 2TB SSD mounted as cache, hosting docker appdata and VM domains. Until now I was using the Unraid cache system to store only new files from a data share, with a script moving them to the array when the SSD was 90% full. With this method, only the most recently written files were on the cache, so I rethought the whole thing, see below.

I use plex to stream to several devices on LAN (gigabit ethernet) or WAN (gigabit fiber internet), and also seed torrents with transmission.

 

Here is my share setup:

image.thumb.png.6d74cd225525368b2cb79f5c2646761d.png

 

So I wanted to dynamically cache files from the data share to the SSD. The main file consumers are plex and transmission, which have their data in a data share.

As a fail-safe, I set mover to only move files if cache usage is more than 95%.

I wrote a script that automagically handles caching of the data share, using the SSD up to 90% (including appdata and VMs).

 

What the script needs:

  • a RPC enabled transmission installation (optional)
  • access to Plex web API (optional)
  • path to a share on cache
  • path to same share on array

 

What the script does:

When you start it, it makes basic connection and path checks, and then 3 main functions are executed (the cleaning function runs twice):

  1. Cleans the selected share on cache to have at least 10% free (configurable). To free space, the oldest data are copied back to the array, then deleted from cache.
  2. Retrieves the list of active torrents from the transmission-rpc daemon and copies them to cache without removing them from the array.
    (note: active torrents are those downloading or seeding during the last minute, but also those starting and stopping; that's a caveat if you start/stop a batch of torrents and launch the script within that minute)
  3. Retrieves the list of active playing sessions from Plex and copies (rsync, same as mover or unbalance) movies to cache without removing them from the array.
    For series, there are options to copy:
    • current and next episode
    • or all episodes from current to the end of season
  4. Cleans again.
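
As a rough illustration of step 3, here is a minimal sketch of how the files being played could be listed. It assumes the Plex `/status/sessions` endpoint with an `X-Plex-Token` query parameter, and that each playing item carries a `file="..."` attribute on a `<Part>` element in the XML reply. `PLEX_URL`, `PLEX_TOKEN` and both function names are hypothetical, they are not taken from the script.

```shell
#!/bin/bash
# Hypothetical settings, not the script's own variable names.
PLEX_URL="${PLEX_URL:-http://localhost:32400}"
PLEX_TOKEN="${PLEX_TOKEN:-changeme}"

# Read Plex XML on stdin and print the file="..." value of every <Part> element,
# one per line. XML entities (e.g. &amp;) are NOT decoded by this sketch.
extract_part_files() {
    grep -o '<Part [^>]*>' | sed -n 's/.* file="\([^"]*\)".*/\1/p'
}

# Ask Plex for the active sessions and list the media files being played.
plex_playing_files() {
    curl -fsS "${PLEX_URL}/status/sessions?X-Plex-Token=${PLEX_TOKEN}" | extract_part_files
}
```

The paths returned are as seen by Plex (inside its container, if any), so they may still need to be mapped to the share path on the host.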

Note:

  • to copy, rsync is used, like mover or unbalance, so it syncs data (it doesn't overwrite existing files)
  • in addition, hard-links, if any (from radarr, sonarr, etc...), are recreated on the destination (cache when caching, array when cleaning cache)
  • if you manually send a file to the share on cache, it will be cleaned when it gets old :) you may want to write a side script for that (for working files, libraries, etc..)
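
The copy described in these notes could look roughly like the sketch below. It assumes rsync's `-a` (archive) and `-H` (preserve hard links) options; the paths are examples and the function names `relative_to` and `cache_file` are hypothetical, not the ones used in the script.

```shell
#!/bin/bash
# Example paths only; adjust to your own share.
STORAGE_PATH="${STORAGE_PATH:-/mnt/user0/data/}"
CACHE_PATH="${CACHE_PATH:-/mnt/cache/data/}"

# Strip a root prefix (given with or without trailing slash) from an absolute path.
relative_to() {
    local root="${1%/}" path="$2"
    printf '%s\n' "${path#"$root"/}"
}

# Copy one file from the array to the cache, keeping its sub-directory.
# The file stays on the array; rsync skips the copy when the destination
# already has the same size and modification time.
cache_file() {
    local rel
    rel="$(relative_to "$STORAGE_PATH" "$1")"
    mkdir -p "${CACHE_PATH%/}/$(dirname "$rel")"
    rsync -aH "${STORAGE_PATH%/}/$rel" "${CACHE_PATH%/}/$rel"
}
```

Because each call transfers a single file, `-H` alone does not relink files that were copied in separate calls; recreating the hard-links on the destination, as the script does, is a separate step that is not shown here.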

 

Because of the shfs mechanism, accessing a file from /mnt/user will read/write from cache if it exists there, otherwise from the array. Duplicate data are not a problem and globally speed things up.
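
To see where a given file physically lives behind /mnt/user, something like the following can help. It is only a sketch: the function name `locate_copies` is hypothetical, and it simply probes each mount root passed as an argument.

```shell
#!/bin/bash
# List the physical locations holding a share-relative path.
# Arguments: relative path, then one or more mount roots to probe.
locate_copies() {
    local rel="$1" root
    shift
    for root in "$@"; do
        [ -e "${root%/}/$rel" ] && printf '%s\n' "${root%/}/$rel"
    done
    return 0
}

# Example on Unraid (paths are illustrative):
#   locate_copies "data/movies/film.mkv" /mnt/cache /mnt/disk[0-9]*
```

A file listed under both /mnt/cache and a /mnt/diskN is one of the duplicates described above.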

 

The script is very useful when, like me, you have noisy/slow mechanical HDDs for storage, and a quick and quiet SSD to serve files.

 

Script installation:

I recommend copying/pasting it into a new script created with User Scripts.

 

Script configuration:

No parameters are passed to the script, so it's easy to use with the User Scripts plugin.

The configuration section is at the beginning of the script; the parameters are pretty much self-explanatory:

image.png.72a81ea60071e281b115841dd0d4d9f6.png

Here is a log from execution:

image.thumb.png.5a027d970e4e2a81671d5e3fabe05d34.png

Pretty neat, huh?

 

Known previous issues (updates may come to fix them later):

  • At the moment, the log can become huge if, like me, you run the script every minute. This is the recommended interval because the transmission-RPC active torrent list contains only the ones from the last minute. 
    Edit 13-02-2020: corrected in latest version
  • At the moment, an orphan file (only on cache) being played or seeded is detected, but not synced to the array until it needs to be cleaned (i.e: fresh torrents, recent movies fetched by *arr and newsgroup, etc...). 
    Edit 13-02-2020: corrected in latest version: it syncs back to the array during set day (noisy) hours.
  • I don't know if/how shfs will handle the file on cache. I need more investigation/testing to see if it efficiently reads the file from cache instead of the array. I guess transmission/plex need to close and reopen the file to pick it up from the new location? (my assumption is that both read chunks, so caching should work).
    Edit 13-02-2020: yes, after checking with the File Activity plugin, that's the case, and plex/transmission take the file on cache as soon as it is available!

 

Conclusion, disclaimer, and link:

The script has been running successfully in my configuration since yesterday. Before using rsync I was using rclone, which has a cache back-end, a similar plex caching function, plus a remote (I used it for transmission), but it's not as smooth and quick as rsync.

Please note that, even if I run it 1440 times a day (User Scripts, Custom schedule * * * * *), this script is still experimental and can:

  • erase (or most likely fill up) your SSD,
    Edit 13-02-2020: but I did not experience this, error handling improved
  • erase (not likely) your array
    Edit 13-02-2020: Technically, this script never deletes anything on the array, so it won't happen
  • kill your cat (sorry)
  • make your mother in law move to your home and stay (I can do nothing for that)
  • break your server into pieces (you can keep these)

Thanks for reading to this point, you deserve to get the link to the script (<- link is here).

 

If you try it or have any comment, idea, recommendation, question, etc..., feel free to reply ;)

 

Take care,

Reynald

 

Edited by Reynald
Correct link
Link to comment

Hello all,

 

I updated the script 2 days ago, it's holding tight!

 

I have very few spin-ups now, because the 1.4TB of most recent data are duplicated on SSD.

 

Quote

- Added max number of torrent/sessions for transmission/plex, preventing caching all torrents at transmission startup/torrent checks
- Added syncing new files on cache to storage during day hours
- Improved logging
- Improved error handling for caching/uncaching function

 

It's on my github: https://bit.ly/Ro11u5-GH_smart-cache

 

Shall I make this a plugin?

Link to comment

Updated: v0/5/14:

- Improved verbosity (new settings)
- Added parameter CACHE_MAX_FREE_SPACE_PCT="85" in addition to CACHE_MIN_FREE_SPACE_PCT="90"
=> When cache usage exceeds CACHE_MIN_FREE_SPACE_PCT (here 90%), space is freed until usage drops to CACHE_MAX_FREE_SPACE_PCT (here 85%)
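
The two thresholds act as a simple hysteresis: nothing happens below 90% usage, and once cleaning starts it continues down to 85%. Here is a minimal sketch of that loop, not the script's actual code. It assumes GNU `find`, reads usage from `df -P`, and the helper names (`usage_pct`, `oldest_first`, `uncache_one`, `free_cache`) are hypothetical. Note that, despite their names, both parameters are compared against usage percentages, as in the post.

```shell
#!/bin/bash
# Example values from the post; paths are illustrative.
STORAGE_PATH="${STORAGE_PATH:-/mnt/user0/data/}"
CACHE_PATH="${CACHE_PATH:-/mnt/cache/data/}"
CACHE_MIN_FREE_SPACE_PCT="${CACHE_MIN_FREE_SPACE_PCT:-90}"   # start cleaning above this usage
CACHE_MAX_FREE_SPACE_PCT="${CACHE_MAX_FREE_SPACE_PCT:-85}"   # stop cleaning at this usage

# Usage percentage (integer, no % sign) of the filesystem holding $1.
usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Files under $1, oldest modification time first (GNU find assumed;
# file names containing newlines are not supported by this sketch).
oldest_first() {
    find "$1" -type f -printf '%T@\t%p\n' | sort -n | cut -f2-
}

# Sync one cached file back to the array, then delete it from the cache
# only if the sync succeeded.
uncache_one() {
    local rel="${1#"${CACHE_PATH%/}"/}"
    mkdir -p "${STORAGE_PATH%/}/$(dirname "$rel")" &&
        rsync -aH "$1" "${STORAGE_PATH%/}/$rel" && rm -f -- "$1"
}

# Free the cache share $1: do nothing below the upper threshold, otherwise
# uncache the oldest files until usage reaches the lower threshold.
free_cache() {
    local path="$1" file
    [ "$(usage_pct "$path")" -gt "$CACHE_MIN_FREE_SPACE_PCT" ] || return 0
    oldest_first "$path" | while IFS= read -r file; do
        [ "$(usage_pct "$path")" -le "$CACHE_MAX_FREE_SPACE_PCT" ] && break
        uncache_one "$file"
    done
}
```

Keeping a gap between the two values avoids cleaning a single file on every run when usage hovers around the limit.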

Link to comment

Nice script!  Love the Plex section of it.

 

Question, why are you using rsync instead of the mover provided binary?

 

The binary just needs a file list.  To move from cache to disk, the file list needs to contain /mnt/cache/<SHARENAME>..., and for array to cache, /mnt/disk<##>/<SHARENAME>

 

Example call from the mover bash script.

  # Check for objects to move from array to cache
  for SHAREPATH in $(ls -dv /mnt/disk[0-9]*/*/) ; do
    SHARE=$(basename "$SHAREPATH")
    if grep -qs 'shareUseCache="prefer"' "/boot/config/shares/${SHARE}.cfg" ; then
      find "${SHAREPATH%/}" -depth | /usr/local/sbin/move -d $LOGLEVEL
    fi
  done
 

 

Link to comment

Hello,

Thank you for your interest and warm words @hugenbdd! This script took me quite a few hours of thinking/scripting.

 

2 hours ago, hugenbdd said:

Question, why are you using rsync instead of the mover provided binary?

I was not aware of the mover binary.

If I'm not mistaken, the /usr/local/sbin/mover.old script, where you found the piece of script for your example, was invoking rsync in the past. I recall having picked the rsync options (-aH) from this mover.old script.

 

My strategy is not to move, but to archive-sync in both directions (same as mover), and to delete from cache depending on disk usage, never deleting from the array.
Some benefits:

- A file is not overwritten if identical; the latest copy is on cache if it exists on cache.

--> Moving from cache to array and vice-versa would take more time than duplicating data (mover also does not really move, but syncs and deletes).

--> Copying from array to cache leaves the data protected by parity

--> Having control over deletion allows me to handle hardlinks (a torrent seeded by transmission from cache is also available for plex). Mover will also preserve them as it moves a whole directory, but I'm moving files.

--> I can bypass the cache "prefer/yes/no/only" directives, and set mover so it won't touch my "data" share until I'm short on space on cache (i.e. if this smart-cache script is killed).

--> Using rsync with the -P parameter while debugging/testing gives some status/progress info

Drawbacks: 

- data is duplicated

- deletions and modifications made on the array using /mnt/user0 or /mnt/diskN are not synced to /mnt/cache. This cannot happen if we use /mnt/user for the 'data' share.

 

But thanks to your suggestion (with the file list idea), I have an idea about how to sync cache-only files (i.e. fresh transmission downloads during quiet hours) to the array.
Also, mover may do some extra checks from array to cache. From cache to array, I use the Unraid shfs mechanism as I sync to /mnt/user0 (and not to /mnt/diskN); same for hardlinks, which are handled well by shfs.
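
One way the file list idea could be sketched: list the files that exist on the cache share but not yet on the array share. This is an illustration only, assuming the array is reachable under /mnt/user0; the function name `cache_only_files` is hypothetical.

```shell
#!/bin/bash
# Print (relative to the share root) the files present under the cache share
# but missing from the array share.
# $1 = cache share root, $2 = array share root (e.g. under /mnt/user0).
cache_only_files() {
    local cache="${1%/}" array="${2%/}" file rel
    find "$cache" -type f | sort | while IFS= read -r file; do
        rel="${file#"$cache"/}"
        [ -e "$array/$rel" ] || printf '%s\n' "$rel"
    done
}

# Possible use, feeding the list to rsync's --files-from option (untested on Unraid):
#   cache_only_files /mnt/cache/data /mnt/user0/data |
#       rsync -aH --files-from=- /mnt/cache/data/ /mnt/user0/data/
```

Since the list is relative to the share root, the same list works for both the source and the destination directory.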

 

If you want to use this script for plex only, you can:

- set $TRANSMISSION_ENABLED to false

or if you want to clean the script:

- remove the #Transmission parameters, the transmission_cache function and its call ('$TRANSMISSION_ENABLED && transmission_cache') at the bottom of the script.
I may extend it to other torrent clients later.
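
For readers unfamiliar with the '$TRANSMISSION_ENABLED && transmission_cache' idiom, here is a tiny standalone demonstration. The function bodies are placeholders, and `PLEX_ENABLED` and `plex_cache` are hypothetical names added by analogy.

```shell
#!/bin/bash
TRANSMISSION_ENABLED=false
PLEX_ENABLED=true   # hypothetical name, by analogy with TRANSMISSION_ENABLED

# Placeholder bodies, for demonstration only.
transmission_cache() { echo "caching torrents"; }
plex_cache() { echo "caching plex sessions"; }

run_enabled() {
    # Each variable expands to the shell command `true` or `false`,
    # so the function after && only runs when the flag is set to true.
    $TRANSMISSION_ENABLED && transmission_cache
    $PLEX_ENABLED && plex_cache
    return 0
}
```

With the values above, calling run_enabled only runs plex_cache.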

Edited by Reynald
Link to comment
  • 2 years later...

Great script! Can you modify it to add video preloading in RAM, like in this script (with this, videos cached by your script would start instantaneously):

 

But it would be only for movies and TV shows cached on SSD by your script. With this addition, your script would be the ultimate solution for caching and speeding up plex!

e.g.: It would be nice to have at least these parameters:
MAX_FREE_RAM_USAGE_PCT (percent of free RAM used for preload)
VIDEO_MAX_PRELOAD_PCT (would be 1% or 2% by default)
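
To make the request concrete, here is a minimal sketch of what such a preload could look like. It assumes that reading the beginning of a file is enough for the kernel to keep it in the page cache, which is not guaranteed under memory pressure. `VIDEO_MAX_PRELOAD_PCT` is the parameter suggested above; the function names are hypothetical and the MAX_FREE_RAM_USAGE_PCT limit is not implemented.

```shell
#!/bin/bash
# Hypothetical default, named after the parameter suggested above.
VIDEO_MAX_PRELOAD_PCT="${VIDEO_MAX_PRELOAD_PCT:-2}"

# Number of bytes corresponding to $2 percent of file $1.
preload_bytes() {
    local size
    size=$(wc -c < "$1")
    echo $(( size * $2 / 100 ))
}

# Read the head of a file and discard it, so the kernel may keep
# those blocks in the page cache (RAM).
preload_head() {
    head -c "$(preload_bytes "$1" "$VIDEO_MAX_PRELOAD_PCT")" "$1" > /dev/null
}
```

For a 10GB movie and the 2% default, this would read about 200MB.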

 

Edited by doobyns
Link to comment
18 hours ago, kizer said:

I tried using this, but I get a path not found. 

 

I created the data share on my system, but it seems to delete it every time I run it. 

 

I'm running 6.10.3

Can you post your settings here for:
STORAGE_PATH
CACHE_PATH
CACHE_DISK

Link to comment
On 6/20/2022 at 8:20 PM, kizer said:

STORAGE_PATH="/mnt/user0/TV/Shows/"

CACHE_PATH="/mnt/cache/data/"
CACHE_DISK="/mnt/cache/data/"

 

I noticed you had CACHE_DISK= /dev/mapper/ and an NVMe device. I only have /dev/mapper/control 

What do you get when you type this:
 

df -h /mnt/cache/

 

Link to comment
52 minutes ago, kizer said:

df -h /mnt/cache/

 

Filesystem   Size     Used    Avail    Use%  Mounted on

/dev/sdh1     932G   153G   779G   20%    /mnt/cache

 

 

 

Try this:
CACHE_DISK="/dev/sdh1"

Link to comment

Is my Storage Path correct for Plex to read from? It's the current location of my TV shows while they are temporarily there, before the mover moves them.  Meaning I use /mnt/user/TV for my TV shows and /mnt/user/Movies for movies. 

 

I'm not as worried about my Movies as much as my TV shows. 

 

I'll give this a try tonight. I'm currently at work, and thank you for taking the time to look at this for me. 

Link to comment
1 hour ago, doobyns said:

CACHE_DISK="/dev/sdh1"

That is not a permanent fix. Even if it works at the moment, it may not work on subsequent boots, as the sdX designations are subject to change based on many factors out of Unraid's control.
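
One possible way around the changing sdX name is to look the device up from the mount point each time the script runs, using the same `df` output shown earlier in the thread. This is only a sketch: the function name `device_for_mount` is hypothetical, and whether the script accepts a value computed this way is untested.

```shell
#!/bin/bash
# Print the device (first column of df) backing the given mount point.
device_for_mount() {
    df -P "$1" | awk 'NR==2 { print $1 }'
}

# Hypothetical use in the configuration section:
#   CACHE_DISK="$(device_for_mount /mnt/cache)"
```

With the `df` output posted above, this would resolve to /dev/sdh1 today and follow the device if it is renamed after a reboot.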

Link to comment

@JonathanM @trurl

 

Totally understand. I'm trying to peek under the curtain some. ;)

 

I'm trying to figure out what the magic is here since it has me really intrigued, but I want to make sure there isn't anything that would cause my system or somebody else's system to go haywire.  Also, if it does what it's supposed to, it would put a pretty cool spin on what I have going on at my place. 

Link to comment

OK, newest update. 

 

        #Rsync path
        STORAGE_PATH="/mnt/user0/TV/"
        CACHE_PATH="/mnt/cache/TV/"
        CACHE_DISK="/dev/sdh1"
        CACHE_MIN_FREE_SPACE_PCT="90"
        CACHE_MAX_FREE_SPACE_PCT="85"

 

Delivers:

Welcome to /tmp/user.scripts/tmpScripts/Plex-Caching-Script/script
stat: cannot statx '/tmp/user.scripts/tmpScripts/Plex-Caching-Script/log.txt': No such file or directory
Info: Log size is

----------------------------
1 active(s) plex session(s):
----------------------------
- 1/1: Serie: Suits Season 9 - Episode 4/-1: Cairo

---------------------
Cache disk usage: 16%
---------------------
- Info: 16% space used, quota is 90%, nothing to do
-- Info: Cleaning empty directories...

 

It just doesn't cache anything though. 

Link to comment
