HDD Auto spinup on Plex container activity


mgutt


This script automatically spins up all defined disks to remove the spin-up latency on playback. It should be executed through CA User Scripts on array startup. It is inspired by @MJFOx's version, but instead of checking the Plex log file, which requires debugging to be enabled, this script monitors the CPU load of the Plex container (which rises as soon as a Plex client is started):

#!/bin/bash

# make script race condition safe
if [[ -d "/tmp/${0///}" ]] || ! mkdir "/tmp/${0///}"; then exit 1; fi
trap 'rmdir "/tmp/${0///}"' EXIT

# ######### Settings ##################
spinup_disks='1,2,3,4,5,6,7' # Note: Usually parity disks aren't needed for Plex
cpu_threshold=1 # Disks spin up if Plex container's CPU load exceeds this value
# #####################################
# 
# ######### Script ####################
while true; do
    # "docker stats --no-stream" needs ~2 seconds to answer, which also paces this loop
    plex_cpu_load=$(docker stats --no-stream | grep -i plex | awk '{sub(/%/, "");print $3}')
    # awk does the comparison because bash cannot compare decimal numbers
    if awk 'BEGIN {exit !('$plex_cpu_load' > '$cpu_threshold')}'; then
        echo "Container's CPU load exceeded threshold"
        for i in ${spinup_disks//,/ }; do
            # the script treats rdevLastIO = 0 as "disk is sleeping"
            disk_status=$(mdcmd status | grep "rdevLastIO.${i}=" | cut -d '=' -f 2 | tr -d '\n')
            if [[ $disk_status == "0" ]]; then
                echo "Spin up disk ${i}"
                mdcmd spinup "$i"
            fi
        done
    fi
done
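
If you want to try the disk check from the loop without touching real disks, something like the following might help. It is only a sketch: the function name disk_is_sleeping and the sample status lines are made up for illustration, and only the key name "rdevLastIO.N" is taken from the script above.

```shell
#!/bin/bash
# Hypothetical helper, not part of the script above.
# Reads "mdcmd status"-style text on stdin and succeeds if disk $1 is asleep.
disk_is_sleeping() {
    local last_io
    # Anchor the key and keep the "=" so that disk 1 does not match disk 10
    last_io=$(grep "^rdevLastIO\.$1=" | cut -d '=' -f 2)
    [[ $last_io == "0" ]]
}

# Made-up sample lines; only the key name is taken from the script above.
sample_status='rdevLastIO.1=0
rdevLastIO.2=1612345678
rdevLastIO.10=0'

for i in 1 2 10; do
    if disk_is_sleeping "$i" <<< "$sample_status"; then
        echo "disk $i is sleeping"
    else
        echo "disk $i is spinning"
    fi
done
```

On a real server the sample text would be replaced by the output of "mdcmd status", as the script does.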

 

Explanation

- it requests the container's CPU load every ~2 seconds (the response time of "docker stats")

- if the load is higher than "cpu_threshold" (default is 1%) it checks the spin state of the disks

- all sleeping "spinup_disks" will be spun up
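
The threshold check can be tried on its own with a few sample values. This is a minimal sketch, assuming a hypothetical helper named exceeds_threshold; it passes the values to awk with -v instead of pasting them into the awk program, so an empty value (e.g. if no container matched) simply counts as 0.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): succeeds if $1 > $2.
# awk handles the decimal comparison that bash's [[ ]] cannot do;
# "+ 0" forces a numeric comparison, so an empty value counts as 0.
exceeds_threshold() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

# Sample values only; in the real loop the load comes from "docker stats".
for load in 0.31 1.00 3.71 45.97; do
    if exceeds_threshold "$load" 1; then
        echo "$load: above threshold"
    else
        echo "$load: below threshold"
    fi
done
```

Note that a load of exactly 1.00 does not trigger with a threshold of 1, because the comparison is "greater than".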

 

Downside

- as long as a movie is running, the (unused) disks won't stay asleep (they spin down, but are spun up again right away)

 

Monitoring

If you would like to monitor the CPU load while (not) using Plex to find an optimal threshold value (or just for fun ;) ), open the WebTerminal and execute this (replace "1" with a threshold of your choice):

while true; do
    plex_cpu_load=$(docker stats --no-stream | grep -i plex | awk '{sub(/%/, "");print $3}')
    echo $plex_cpu_load
    if awk 'BEGIN {exit !('$plex_cpu_load' > 1)}'; then
        echo "Container's CPU load exceeded threshold"
    fi
done

On my machine Plex idles between 0.1 and 0.5% CPU load, which is why I chose 1% as the default threshold.
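
To pick a threshold from more than a glance at the terminal, the printed values could be collected and summarized. The following is a sketch under assumptions: summarize_load is a hypothetical helper, and the sample values are made up. It expects one load value per line, e.g. from a file that the monitoring loop appends to.

```shell
#!/bin/bash
# Hypothetical helper: reads one CPU load value per line from stdin
# and prints the number of samples, the average and the maximum.
summarize_load() {
    awk 'NF { sum += $1; if (n == 0 || $1 > max) max = $1; n++ }
         END { if (n) printf "samples=%d avg=%.2f max=%.2f\n", n, sum / n, max }'
}

# Made-up sample values for illustration:
printf '%s\n' 0.12 0.45 0.31 3.71 | summarize_load
```

If the maximum during idle times stays clearly below the load you see when a client opens Plex, any value in between should work as threshold.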

Edited by mgutt
Link to comment

Hi @mgutt

 

This is another great idea you have!

I think my drives mostly don't spin down before the evening, but I have had complaints late at night that the stream did not start

- and then you had to press play again? (Spin-up) (Not the end of the world, but if I can fix it... then why not ;-))

 

So running this on Plex I get these spikes?

I checked and I have no TV-recording, no playback, and no scanning of library

What do you think could be causing these spikes?

 

image.png.0256fba114f38ea94a329a41b5b87772.png

Highest value was 45.97

 

From Plex

image.thumb.png.8130665d29eb8dbb7219d7a3ec3373e1.png

Link to comment

When this happens execute the following to see all Plex processes

 

 top -b -c -d 5 | grep -i -E "(top -|tasks:|%cpu|mem:|swap:|plex)"

 

I guess Plex:

- rescans the library

- generates thumbnails

- tries to find TV show intros (so they can be skipped)

 

 

 

Edited by mgutt
Link to comment
22 hours ago, mgutt said:

When this happens execute the following to see all Plex processes

 

 top -b -c -d 5 | grep -i -E "(top -|tasks:|%cpu|mem:|swap:|plex)"

 

I guess Plex:

- rescans the library

- generates thumbnails

- tries to find TV show intros (so they can be skipped)

 

 

 

So I ran both scripts at the same time:

image.png.53de2bc1085f02ff29c9ab85336f63bc.png

 

But below I don't see the CPU go above 1%

image.thumb.png.66c0273985078b6bf21f288a99370fc4.png

Not a single time!

This is really strange

 

 

Edited by casperse
Link to comment
1 minute ago, casperse said:

I don't see the CPU go above 1%

Really strange. Maybe the container CPU load includes I/O wait? Your top results show permanent wait (1.4, 0.7, 0.5 and 1.0). This means something is reading from or writing to an HDD/SSD and the volume does not answer fast enough.

 

Please repeat the test, but this time check all processes:

top -c

I'll bet it's Unraid's SHFS process ^^

Link to comment
5 minutes ago, mgutt said:

Really strange. Maybe the container CPU load includes I/O wait? Your top results show permanent wait (1.4, 0.7, 0.5 and 1.0). This means something is reading from or writing to an HDD/SSD and the volume does not answer fast enough.

 

Please repeat the test, but this time check all processes:


top -c

I'll bet it's Unraid's SHFS process ^^

image.thumb.png.4e34ac980fa63bd6f4a0ab861ecc84bf.png

 

I can't see what it is, hope you can :-)

Link to comment

 

7 hours ago, casperse said:

I can't see what it is, hope you can 🙂

I would say you have so much activity on your drives that even a small write from Plex, e.g. to a log file, can produce I/O wait, which boosts the Docker container's load.

 

If you like you could install "iotop" through the NerdPack Plugin and execute it:

iotop

 

It lists the processes with the most I/O activity.

 

And you could install sysstat to execute iostat, which returns the activity per disk:

watch -t -n 0.1 iostat -d -t -y 5 1

Finally, you would need to find out which process is writing to the same disk that Plex uses. But I'm not sure this effort is worth it. I mean, what could you do to remove the I/O wait? Maybe direct disk access tweaking? Or a separate disk for Plex alone?

Edited by mgutt
Link to comment

Hmm yes, I do have a lot of things running on the server!

But I did manage to get individual sleep of drives working by having a separate UAD for downloads and other things that I don't want to have on the array (there is a nice FAQ about this setup in the forum)

And yes, I have a lot of apps running 24/7

 

I think the cache is the problem!

Plex and all appdata are located on the cache drive (an NVMe drive), and it's the same drive that is used as cache. Is that creating the high I/O?

 

So I can't use Plex %CPU because everything is on my cache drive. But I still experience "slow" playback from Plex at strange hours because the other disks are in sleep state. I would need to find another trigger; too bad you can't set Plex user web activity = spin-up 🙂

 

 

 

Link to comment
20 hours ago, mgutt said:

Do you already use direct disk access (replace /mnt/user/... with /mnt/cache/...) for Plex appdata, docker.img and the appdata of other Docker containers that only use the cache?

 

I see what you mean!

I only went into the dockers of Plex/Emby and changed the path to /cache/. I should do that for the /appdata/ path of all running dockers!

I'll come back and report my findings when that's done  ;)
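
For reference, the path change discussed here can be sketched as a small function. This is only an illustration: to_cache_path is a hypothetical helper, the example path is an assumption, and the mapping only makes sense for data that lives on the cache only. The actual change is made in each container's path settings; the function just prints what the new path would look like.

```shell
#!/bin/bash
# Hypothetical helper: rewrite a user-share path to the direct cache path.
to_cache_path() {
    local path="$1"
    if [[ $path == /mnt/user/* ]]; then
        echo "/mnt/cache/${path#/mnt/user/}"
    else
        echo "$path"   # leave any other path untouched
    fi
}

# Example path is an assumption; use the host paths from your own containers.
to_cache_path "/mnt/user/appdata/plex"
```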

Link to comment
1 hour ago, casperse said:

I only went into the dockers of Plex/Emby and changed the path to /cache/

Maybe this will solve it, because it does not make sense to have I/O wait on an NVMe. An NVMe has a huge bandwidth and (unlike SATA) allows parallel access. Or do you use a low-grade NVMe with QLC flash / no SLC cache / no RAM?

Link to comment
2 minutes ago, mgutt said:

Maybe this will solve it, because it does not make sense to have I/O wait on an NVMe. An NVMe has a huge bandwidth and (unlike SATA) allows parallel access. Or do you use a low-grade NVMe with QLC flash / no SLC cache / no RAM?

 

I got this one: Samsung SSD 970 EVO Plus 2TB the fastest I could get around a year ago! (Plex poster scroll loading :D)

So this should help with the I/O wait

Link to comment

So I changed all appdata paths to /cache/

(If I ever update my cache drive I need to remember to change this back before moving it back to the array! LOL)

 

I then shut down every docker and every VM, so only the Plex docker was left running:

image.thumb.png.a6fcca1e3755e38bd144bcc07bf71cdd.png

 

And I keep getting these bumps, like a heartbeat? So the only thing left is the Unraid apps (I cannot stop them, but I could start uninstalling them?)

Or is it the Plex Inc. docker that is somehow different from the others? (I could try another one?)

Almost given up now! :-) 

 

Link to comment
33 minutes ago, mgutt said:

First process is SHFS. Something is still writing to / reading from "/mnt/user/".

 

1.) Your docker.img path uses /mnt/cache?

Yes, I pretty much have /cache/ written everywhere now :-)

image.png.bf05320959283447a321d729e51fc406.png 

33 minutes ago, mgutt said:

2.) Did you disable Heartbeat? (The Link contains a command to monitor writes to the docker.img as well)

Funny, I said that it looked like a heartbeat, I didn't know there actually was one! (I will disable it when there is no traffic and test again)

Thanks!

Link to comment

Finally - thanks for all your help! - I think it's now at the point where I can use this script without spinning my drives up 24/7 :D

But maybe I should set it to around 4-5% - I still get some spikes after 3 min:
3.71 - Container's CPU load exceeded threshold

And I think Plex activity of any user going into any media library will go way above 4%... even 10% in my testing
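
Besides raising the threshold, single spikes like this could also be filtered by only reacting when several samples in a row are above the threshold. This is a different technique than the one used in the script and only a sketch: debounce is a hypothetical helper, the sample values are made up, and it reads one load value per line.

```shell
#!/bin/bash
# Hypothetical filter: reads one load value per line and reports the sample
# at which `needed` consecutive values were above `limit`.
debounce() {
    awk -v limit="$1" -v needed="$2" '
        { if ($1 + 0 > limit + 0) streak++; else streak = 0 }
        streak == needed { print "trigger at sample " NR }'
}

# Made-up samples: a single spike (3.71) followed by sustained load
printf '%s\n' 0.3 3.71 0.4 12.5 10.2 11.8 | debounce 1 2
```

The trade-off is a later spin-up: each sample takes about 2 seconds ("docker stats"), so every additional required sample adds roughly that much delay.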

It could be interesting to see if this actually has any noticeable impact on my electric bill, since all 22 drives (not including the 2 parity drives) would be spun up during any playback of any movie. I actually set up the separate UAD drive for frequent access in order to avoid spin-ups, but that was 24/7, so I think this is a reasonable trade-off... Thanks again

image.thumb.png.e004193b9666ab710d76dff06dee8915.png

 

Shouldn't it keep them spinning for the 2-3 min until the threshold is reached again, keeping the drives spun up?

Edited by casperse
Link to comment
52 minutes ago, casperse said:

all 22 drives (not including the 2 parity drives) would be spun up during any playback of any movie

Maybe it's possible to monitor the traffic of the disks or use the Plex API to spin down all unused drives after several minutes instead of waiting for the usual time. But of course this would cause a delay if someone starts a second movie from a different disk. What do you think?

55 minutes ago, casperse said:

Shouldn't it keep them spinning 2-3min between getting threshold again? keeping the drives spin-up

After they spin up, the usual timeout should keep them in this state. Or what do you mean?

Link to comment
1 hour ago, mgutt said:

Maybe it's possible to monitor the traffic of the disks or use the Plex API to spin down all unused drives after several minutes instead of waiting for the usual time. But of course this would cause a delay if someone starts a second movie from a different disk. What do you think?

Since this only monitors the Plex service, I think it's OK to spin up all drives for a person accessing the system (it's a VIP service ;))

1 hour ago, mgutt said:

After they spin up, the usual timeout should keep them in this state. Or what do you mean?

I was wondering what the timeout of this "Spin-up state" was?

Since I can see that accessing the Plex service and any library folder in Plex draws +10% CPU every time!

Even just accessing my Tautulli app on my iPhone would push the script to activate, which is OK and by design!

 

But what happens after you have accessed the library and start to scroll through titles (they are all on the cache and load into the browser cache), so the %CPU goes down again - how long can you keep "browsing titles" before the drives spin down again?

 

I use your test script to monitor activity of the script (4%):

while true; do
    plex_cpu_load=$(docker stats --no-stream | grep -i plex | awk '{sub(/%/, "");print $3}')
    echo $plex_cpu_load
    if awk 'BEGIN {exit !('$plex_cpu_load' > 4)}'; then
        echo "Container's CPU load exceeded threshold"
    fi
done

 It's a great way to see how and when it triggers

Link to comment
18 minutes ago, casperse said:

But what happens after you have accessed the library and you start to scroll through titles (They are all on the cache and loads to the browser cache) so the %CPU would go down again - how long time using "browsing titles" before it would spin down again?

For as long as the default spin-down delay in the Unraid settings.

Link to comment
3 minutes ago, trurl said:

No need to guess. The default is in Settings - Disk Settings

Oops! Thanks @trurl for reminding me! - I just checked, and after my last server migration I forgot to change it back, so it was set to 4 hours!

So back to 1 hour now (Which I think is the default value)

Edited by casperse
Link to comment
