Recovering from Highwater splitting


NAS

Everyone knows about the splitting problem, and much discussion has gone into designing a highwater enhancement that reduces it.

 

In summary, though: over time, on a many-drive system that is, or at one point has been, close to full, you will find that folders you don't want split become massively split. This is akin to file fragmentation, but across multiple files that ideally would be on the same disk. Why does it matter? Because if you are watching, say, a TV show, you don't want six disks spun up just so you can flick through a series.

 

I have created a simple tool to help recover from this situation by showing you drive allocation per folder. It is by no means perfect, and this thread is as much about enhancing it as it is about anything else. Here it is:

 

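#!/usr/bin/bash

# Mount points of the array disks, e.g. /mnt/disk1
VALUES=($(mount | awk '/disk[0-9]/ {print $3}'))
# Current path with the leading /mnt/user stripped, e.g. "/TV/TV - W"
FOLDER=$(pwd | sed 's/\/mnt\/user\(\/.*\)/\1/')

for ((i=0; i<${#VALUES[@]}; i++))
do
    # Only report disks that actually hold part of this folder
    if [ -d "${VALUES[$i]}$FOLDER" ]
    then
        du -sh "${VALUES[$i]}$FOLDER"
    fi
done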

 

I have called this undist.sh.

 

To use it, change into a folder on a user share and run the tool; it will tell you how much of that folder lives on each drive.

 

Example:

 

root@TOWER[TV - W] 08:34 AM>pwd
/mnt/user/TV/TV - W
root@TOWER[TV - W] 08:34 AM>/boot/scripts/undist.sh
50G     /mnt/disk1/TV/TV - W
706M    /mnt/disk13/TV/TV - W
352M    /mnt/disk9/TV/TV - W
17G     /mnt/disk2/TV/TV - W
25G     /mnt/disk12/TV/TV - W
28G     /mnt/disk11/TV/TV - W
39G     /mnt/disk10/TV/TV - W
7.7G    /mnt/disk4/TV/TV - W
60G     /mnt/disk14/TV/TV - W
2.8G    /mnt/disk15/TV/TV - W
12K     /mnt/disk5/TV/TV - W

 

You can see that unRAID has split this folder all over the place; in fact, if I were to look through this entire folder, 11 drives would spin up!

 

The next enhancement should really be to show free space on each disk, but I can't immediately see how to go from diskX to mount point Y. Suggestions on a postcard :)

Link to comment

I've been keeping an eye on disk allocation as my array fills up.  I upload TV series as you do, and some of the directories are 50+ GBs in size.  If your split level is not set correctly, it will scatter those files all over the place as you indicated.

 

What I'd really like to see is user defined high water.  I would set mine to 90% - that is fill a disk to 90% before rolling over to the next disk.

Link to comment

This is EXACTLY what I would like to see also. I have unRaid Pro and like the cache drive, but as it stands now it scatters stuff all over the place. I would love to specify my own mark for when stuff should be written to another drive. Right now I have mine set so that the cache drive is used but only one disk is in the included disks list, and from there I just need to pay attention to how full that disk is getting. After it fills to a certain limit, I set that disk as excluded and a different one as included. This makes it work the way I would like, but it is kind of tedious. If I could just specify a highwater mark, set the included disks as I like and then let the mover script do its thing, that would be great.

Link to comment
  • 2 months later...

Again, I point out this is scrappy. However, here's a better version:

 

#!/usr/bin/bash

#Get a list of drives in use
VALUES=($(mount | awk '/disk[0-9]/ {print $3}'))

#Get the interesting bit of the folder this tool was called from
FOLDER=`pwd | sed 's/\/mnt\/user\(\/.*\)/\1/'`
echo "Folder: ${FOLDER}";

#Loop through the values of the drive that are installed
for ((i=0; i<${#VALUES[@]}; i++))
do

if [ -d "${VALUES[$i]}$FOLDER" ]
then
    #Get the size on disk of the folder but on one of the disks
    fld=($(du -sh "${VALUES[$i]}$FOLDER"));
    #Get the space free on this same disk
    dsk=($(df -h "${VALUES[$i]}" | sed -n '2p'|awk '{print $4}'));
    echo "Disk:${VALUES[$i]} Using:${fld} Free:${dsk}";

fi
done

 

 

Example of usage:

 

root@TOWER [~] 10:22 AM>pwd
/root
root@TOWER [~] 10:22 AM>cd /mnt/user/Movies/
root@TOWER [Movies] 10:22 AM>/boot/scripts/undist.sh
Folder: /Movies
Disk:/mnt/disk7 Using:169G Free:115G
Disk:/mnt/disk8 Using:121G Free:149G
Disk:/mnt/disk3 Using:431G Free:25G
Disk:/mnt/disk15 Using:250G Free:41G

 

 

So I navigated to the folder I want to see the split on and ran the script. It told me the Movies folder lives on 4 drives, how much of it is on each drive and how much free space each drive has.

 

 

 

Link to comment

Got you now.

 

You can easily use my script above and mc to recover. It is not quick, but it's easy to do.

 

An auto script would be very, very tricky, because not only would it need to take the next merge into account, it would really need to play wargames to work out which moves are the most economical. For example: if I move folder 1 to disk 2, that frees up space on disk 1, which then allows folder 3 to move, which in turn creates some space, so if I move folder 4... and so on.

 

A tool that can basically be run over and over and only has simple logic is far less efficient, but not outwith the realms of possibility. The logic would be:

User, please pick a folder.

Script: this folder is on these drives, using this much space.

Starting with the drive with the most content on it, move files in from the drive with the least content until it is full.

Loop.

 

 

That's doable, but not by me, as I cannot find a tool that can merge folders reliably other than mc, and I can't see how this would be scripted.
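
The simple loop described above can at least be sketched as a dry run that only prints what it would move. This is a rough sketch, not a tested recovery tool: `plan_merge` is a hypothetical name, it assumes the array disks are mounted as diskN under a common root (/mnt unless a second argument is given), and it never moves anything. The move itself is still left to mc.

```bash
# Dry run: print which per-disk pieces of a folder could be merged onto
# the disk that already holds most of it. Nothing is moved.
# plan_merge is a hypothetical helper; the root defaults to /mnt.
plan_merge() {
    local folder="$1" root="${2:-/mnt}"
    local d kb sorted target free tab
    tab=$(printf '\t')

    # "sizeKB<TAB>disk" for every disk holding part of the folder, smallest first
    sorted=$(
        for d in "$root"/disk[0-9]*; do
            [ -d "$d/$folder" ] || continue
            printf '%s\t%s\n' "$(du -sk "$d/$folder" | cut -f1)" "$d"
        done | sort -n
    )
    if [ "$(printf '%s\n' "$sorted" | grep -c .)" -lt 2 ]; then
        echo "Not split: $folder"
        return 0
    fi

    # The disk holding the most content is the merge target
    target=$(printf '%s\n' "$sorted" | tail -n 1 | cut -f2)
    free=$(df -Pk "$target" | awk 'NR==2 {print $4}')

    # Work upwards from the disk holding the least content until the target is full
    printf '%s\n' "$sorted" | sed '$d' | while IFS="$tab" read -r kb d; do
        if [ "$kb" -le "$free" ]; then
            echo "move $d/$folder -> $target/$folder (${kb}K)"
            free=$((free - kb))
        else
            echo "stop: $target has no room for $d/$folder (${kb}K)"
            break
        fi
    done
}
```

Called as `plan_merge "TV/TV - W"`, it would print one "move" line per disk whose piece still fits on the target. Re-running it after each manual move gives the "Loop" step.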

 

Link to comment
  • 1 year later...

Help.  :o

 

I need to take this tool to the next level and I could do with opinions on what direction to take.

 

The tool as it stands will tell you the split amounts one directory at a time.

 

However in a perfect world I think we could all benefit from a tool that could tell you where your splits could be optimised.

 

For instance if I have a folder A with 35GB of data on one drive and a folder A on another with 1MB of data that is not an optimal situation.

 

I have found that on my HD box I have virtually no movie with the mkv, art and nfo on the same drive. So that's something like 15GB split up as 14.99GB on one drive and 0.01GB on two others.

 

We could do with a means of creating a report of where splits could be optimised.
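
One possible starting point for such a report, as a rough sketch only: list every sub-folder of a share that exists on more than one disk, with its size on each disk. `split_report` is a hypothetical name, and the sketch assumes the disks are mounted as diskN under /mnt (or under the root given as a second argument). It only looks one level below the share, so a layout like Movies/<title> is assumed.

```bash
# Report sub-folders of a share that are spread over more than one disk.
# split_report is a hypothetical helper; the root defaults to /mnt.
split_report() {
    local share="$1" root="${2:-/mnt}"
    local name path count

    # Every distinct sub-folder name found under the share on any disk
    for path in "$root"/disk[0-9]*/"$share"/*/; do
        if [ -d "$path" ]; then basename "$path"; fi
    done | sort -u | while IFS= read -r name; do
        # Count the disks that hold a folder of this name
        count=0
        for path in "$root"/disk[0-9]*/"$share"/"$name"; do
            if [ -d "$path" ]; then count=$((count + 1)); fi
        done
        # A folder on a single disk is not split, so skip it
        [ "$count" -gt 1 ] || continue
        echo "$name"
        du -sh "$root"/disk[0-9]*/"$share"/"$name" | sed 's/^/    /'
    done
}
```

Running `split_report Movies` would print each split movie folder followed by an indented `du` line per disk, so a 14.99GB/0.01GB split shows up at a glance.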

 

Thoughts?

Link to comment
  • 4 weeks later...

Heh it's funny that I just noticed this thread because it's 2 days after finally taking the time to clean up my highwater mess from my earlier unRaid days.  Interesting idea for a script NAS, but that just sounds so dang complicated for such a nonsensical problem.

 

I ended up using a Windows-based file/folder merging utility called WinMerge to compare disk 1/films to disk 2/films (for instance). It would list which folders were on the left (disk 1) and which were on the right (disk 2), and any folders shared between both drives would be listed as blank. All I did then was select all the shared folders (as reported by the program) on disk 1 and Teracopy them over to disk 2; whenever a large movie file began transferring, I'd skip it. Once that step finished, I would do the same from disk 2 to disk 1. If all went well and I skipped everything properly, both drives should now be nicely cleaned up. From there I did disk 1 to disk 3, and so on. It didn't take much time, maybe an hour at the most for all 8 disks I went through. It got progressively easier as I cleared out each drive, since there was less to clean up as I went. The longest/hardest part was devising that initial process, which worked wonderfully, other than the need for a Windows-based computer to launch the process (I'm sure you can find something in another flavor to do the same, or hell, write your own script to do the same).
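
The comparison step, finding the folders that exist on both disks, can also be approximated from the unRAID console. A minimal sketch, assuming the two arguments are the same share on two different disks; `list_dirs` and `shared_folders` are hypothetical names, and the copying itself is not covered here.

```bash
# Print the names of the immediate sub-folders of a directory
list_dirs() {
    local p
    for p in "$1"/*/; do
        if [ -d "$p" ]; then basename "$p"; fi
    done
}

# Print folder names that exist in both directories.
# Each name appears at most once per listing, so a name that
# shows up twice in the combined list must be on both disks.
shared_folders() {
    { list_dirs "$1"; list_dirs "$2"; } | sort | uniq -d
}
```

For example, `shared_folders /mnt/disk1/films /mnt/disk2/films` would list the folders that need cleaning up between disk 1 and disk 2.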

 

Now I can safely use a split level of 0 without worry of it further spreading files between 2 or more directories.

Link to comment
  • 3 years later...

What I'd really like to see is user defined high water.  I would set mine to 90% - that is fill a disk to 90% before rolling over to the next disk.

 

Just set the allocation to "Fill-Up"  :)    There is no reason to stop at 90%, but if you want it to stop at 90%, you can set the allocation to "Fill-Up" and set the minimum free space to a value equal to 10% of the drive's capacity.
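
As a worked example of that arithmetic, 10% of a 2TB drive is 200GB. For a mounted disk the figure can be read off `df`; the sketch below is only illustrative, `min_free_kb` is a hypothetical name, and the result is in KB, so it may need converting to whatever unit the share settings expect.

```bash
# Print the given percentage (default 10) of a mounted disk's capacity, in KB.
# min_free_kb is a hypothetical helper name.
min_free_kb() {
    local mount="$1" percent="${2:-10}" total
    # Column 2 of POSIX-format df output is the size in 1024-byte blocks
    total=$(df -Pk "$mount" | awk 'NR==2 {print $2}')
    echo $(( total * percent / 100 ))
}
```

For instance, `min_free_kb /mnt/disk1` prints the value that corresponds to stopping at 90% full on disk 1.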

 

 

I guess this is why I've always written to disk shares as opposed to user shares, so I know exactly where I'm putting files and I can control the levels.

 

Manually writing to the individual disks is, of course, an absolute way to control your allocation  8)

Link to comment

Ehm...

 

It should not matter if files are scattered.. however I do understand the feeling and need to "tidy up"..

 

To avoid copying things back and forth and becoming an MC pilot:

 

Let's say you have a share "TV" that is scattered like this. Why not create a new share (let's call it "SERIES"), set up the split level the way you want it, and then just set the system to copy from TV to SERIES? It will get all the files from all the places and move them over to the new share, where the split level is the way you want it.

 

 

 

Personally I have also chosen to change my setup from "just put it anywhere and be very drive space efficient" towards:

 

- every folder under MOVIES can be only on one disk (so every movie is kept together)

- every folder under SERIES\<NAME> can be only on one disk (so every season of a show is kept together)

 

And then I set up specific includes for each share, adding drives as disks fill up; that keeps stuff together without getting too manual a process.

 

Link to comment

Agree it doesn't make a lot of difference. There are many ways to organize the various shares, and UnRAID will handle it no matter how you choose to allocate the files among the various disks. Clearly it's best if a single movie/show is only on one disk, so if you're storing DVDs in the original VOB structure you want to use a split level that won't allow a DVD to spread across disks; but if all your media is in container formats, so that there's one file per movie or show, then that's automatically taken care of.

 

If you want a share to be restricted to one disk as long as there's room, the approach Helmonder noted works very well -- just use Includes to restrict each share to a dedicated disk until it's full; then add another disk to the list of included disks as needed.  Or include all the disks you want to dedicate to a share and set the allocation to "Fill-Up" ... which will then only use one of the included disks at a time.

 

Link to comment

I know this is an old thread, but did anything come to a conclusion on some type of script to clean up the disks?

 

Myk

 

Obviously you can change your folder structure to match an ideal for unRAID highwater level and you can then fettle with the settings all day long and probably get quite close to what you need.

 

However this comes at the cost of ease of configuration and artificially running out of disk space.

 

Ideally unRAID would have a setting to "prefer" a disk. This would be similar in nature to included disks but not as hard and fast, i.e. use this disk while you can and then just revert to normal.

 

But unRAID does not have the feature and possibly never will.

 

So in light of that fact, you can run this script once in a while and use the info it gives, along with mc, to do much the same. I find myself doing it when I start upgrading disks, e.g. replacing 2TB with 4TB.

Link to comment
  • 7 months later...

NAS

 

When I try your script I get the following error.

 

root@Tower:/mnt/user/Movies HD# /boot/shellscript/undist.sh
Folder: /Movies HD
/boot/shellscript/undist.sh: line 17: syntax error near unexpected token `('
/boot/shellscript/undist.sh: line 17: `    fld=($(du -sh "${VALUES[$i]}$FOLDER"));'

 

 

I'm using this copy of your script.

 

#!/usr/bin/bash

#Get a list of drives in use
VALUES=($(mount | awk '/disk[0-9]/ {print $3}'))

#Get the interesting bit of the folder this tool was called from
FOLDER=`pwd | sed 's/\/mnt\/user\(\/.*\)/\1/'`
echo "Folder: ${FOLDER}";

#Loop through the values of the drive that are installed
for ((i=0; i<${#VALUES[@]}; i++))
do

if [ -d "${VALUES[$i]}$FOLDER" ]
then
    #Get the size on disk of the folder but on one of the disks
    fld=($(du -sh "${VALUES[$i]}$FOLDER"));
    #Get the space free on this same disk
    dsk=($(df -h "${VALUES[$i]}" | sed -n '2p'|awk '{print $4}'));
    echo "Disk:${VALUES[$i]} Using:${fld} Free:${dsk}";

fi
done

 

Solved. The indents had weird characters in them.
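
For anyone hitting the same error, stray characters like these can be tracked down from the console. The sketch below assumes the culprits are non-ASCII bytes (for example non-breaking spaces picked up when copying from a web page) or carriage returns; that is a guess about the cause, not something confirmed in this thread, and `find_odd_chars` is a hypothetical name.

```bash
# Print the numbered lines of a file that contain anything other than
# tabs and printable ASCII (space through ~).
# LC_ALL=C makes grep compare raw bytes; -a forces the file to be treated as text.
find_odd_chars() {
    LC_ALL=C grep -an "$(printf '[^\t -~]')" "$1"
}
```

Running `find_odd_chars /boot/shellscript/undist.sh` lists the affected lines, and `bash -n /boot/shellscript/undist.sh` checks the script's syntax without executing it.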

 

Kevin.

Link to comment
