new split level



Hi,

What do you guys think about a new split level?

 

-Put all files which are in the same directory on the same HDD, if possible (size permitting)-

 

?

 

 

This would stop the confusion about split level. (I know every one of you understands it, but I don't know if it's because of my English or I'm just too stupid. Also, when I search for it, I'm not the only one with this problem.) I guess this new split level would work in more than 90% of all scenarios (unless you don't use separate folders for e.g. movies).

 

--- edit --- 

My main suggestion is that the split level works like it does now, BUT it tries, on a PER DIRECTORY basis, to keep files which are in the same directory together. BUT NOT DIRECTORIES.

 

See this more detailed view of my idea:

before (what split level does now, if you don't set it, or set it wrong):

/some/share/films/film1

-> file1 (hdd1)

-> file2 (hdd2)

-> file3 (hdd3)

 

/some/share/films/idc1

-> file1 (hdd1)

-> file2 (hdd2)

-> file3 (hdd3)

 

/some/share/films/idc1/something/ishere

-> file1 (hdd3)

-> file2 (hdd2)

-> file3 (hdd1)

 

/some/share/series/serie/serie s01/film1

-> file1 (hdd1)

-> file2 (hdd2)

-> file3 (hdd2)

 

after (what my new "zero intelligence needed" smart split level could do):

/some/share/films/film1

-> file1 (hdd1)

-> file2 (hdd1)

-> file3 (hdd1)

 

/some/share/films/idc1

-> file1 (hdd3)

-> file2 (hdd3)

-> file3 (hdd3)

 

/some/share/films/idc1/something/ishere

-> file1 (hdd5)

-> file2 (hdd5)

-> file3 (hdd5)

 

/some/share/series/serie/serie s01/film1

-> file1 (hdd2)

-> file2 (hdd2)

-> file3 (hdd2)

 

Arguments against it that the community has come up with so far:

-It would use too much disk, RAM, CPU.

I'm not technical, but in my mind it takes the same resources, or makes no noticeable difference, to just check which HDD all the other files in a folder are on.

-If you have a directory with many big files, which in total are bigger than HDD X

That's a problem, and it can be solved by just falling back to the standard "put it wherever there is space". (For a nice-to-have, see the bottom.)
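The rule described above (keep a directory's files together, fall back to the normal behaviour when they do not fit) can be sketched roughly as follows. This is only an illustration of the idea, not how Unraid chooses disks: the `Disk` class, `choose_disk`, `write_file` and the `most_free` fallback are all hypothetical names, and the in-memory `dirs` index is an assumption about what the system would have to track.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    name: str
    free: int  # free space in bytes
    # directory path -> set of file names already stored on this disk
    dirs: dict = field(default_factory=dict)


def most_free(disks, size):
    """Hypothetical stand-in for the share's configured allocation method."""
    candidates = [d for d in disks if d.free >= size]
    return max(candidates, key=lambda d: d.free) if candidates else None


def choose_disk(disks, directory, size, fallback=most_free):
    """Prefer a disk that already holds files from `directory` and has room."""
    holders = [d for d in disks if d.dirs.get(directory) and d.free >= size]
    if holders:
        # If the directory is already split, prefer the disk holding most of it.
        return max(holders, key=lambda d: len(d.dirs[directory]))
    # New directory, or no holder has room: use the normal allocation method.
    return fallback(disks, size)


def write_file(disks, directory, name, size):
    """Place one file and update the bookkeeping; returns the disk name."""
    disk = choose_disk(disks, directory, size)
    if disk is None:
        raise OSError("no disk has enough free space")
    disk.dirs.setdefault(directory, set()).add(name)
    disk.free -= size
    return disk.name


if __name__ == "__main__":
    disks = [Disk("hdd1", 100), Disk("hdd2", 500)]
    print(write_file(disks, "films/film1", "file1", 50))  # hdd2 (most free)
    print(write_file(disks, "films/film1", "file2", 50))  # hdd2 (same directory)
```

Note that subdirectories are treated as separate directories here, which matches the "files together, BUT NOT DIRECTORIES" part of the suggestion.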

 

Why do I think it's a good idea?

I think this option to keep files which are in the same directory together, but not the directories themselves, is an option for standard clueless noobs, and in most situations I think it will improve speed, reliability and disk lifespan with no additional negatives.

 

 

As an EXTRA "nice to have", the mover could also try to honor this setting while it is moving files anyway. Like "oh, there's only 10GB left on HDD1", and there is one folder with many big files, "maybe I should move it to the empty HDD2".

OR

"Last time I couldn't fit file234 in directory X because that HDD didn't have enough space, but now there is space for it, so move it back."

OR

"I could move these 891 files to that HDD", and then the 1 big movie file would fit on the correct HDD.

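A rough sketch of what such a mover pass might compute, under heavy assumptions: `plan_consolidation` is a hypothetical function, the `(directory, name, size, disk)` tuples and the `free` map are made-up inputs, and it only covers the simplest case (pull stray files onto the disk that already holds most of the directory, and skip the directory if they do not fit). The "move 891 files away to make room" case is not handled.

```python
from collections import defaultdict


def plan_consolidation(files, free):
    """Return a list of (path, src_disk, dst_disk) moves.

    files: list of (directory, name, size, disk) tuples
    free:  dict mapping disk name -> free bytes (the caller's dict is not modified)
    """
    free = dict(free)
    by_dir = defaultdict(list)
    for directory, name, size, disk in files:
        by_dir[directory].append((name, size, disk))

    moves = []
    for directory, entries in sorted(by_dir.items()):
        per_disk = defaultdict(int)
        for _, size, disk in entries:
            per_disk[disk] += size
        if len(per_disk) < 2:
            continue  # all files already share one disk
        # Target the disk that already holds most of the directory's bytes.
        target = max(sorted(per_disk), key=lambda d: per_disk[d])
        strays = [(n, s, d) for n, s, d in entries if d != target]
        needed = sum(s for _, s, _ in strays)
        if needed > free.get(target, 0):
            continue  # does not fit: leave the directory where it is
        for name, size, disk in strays:
            moves.append((f"{directory}/{name}", disk, target))
            free[disk] = free.get(disk, 0) + size
        free[target] -= needed
    return moves
```

The result is only a plan. Whether the mover could act on it safely (open files, parity writes, timestamps) is a separate question.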
 

 

 

 

 

Edited by nuhll
Link to comment

I even have a more advanced idea about this.

My first suggestion would probably be better than how it is now, where you have to set it specifically for every share and count the directories and such things. All of that could be saved.

But perfect would be if UNRAID just LEARNED which files are accessed together and (when idle) shuffled them around once a month (moving them from HDD X to HDD Y), so that drives fill up as configured BUT also, if possible, files which are accessed together end up on the same HDD.

That would be clever. I'm pretty sure that, with someone more clever than me thinking about it, it could be "easily" done.

 

Quote

1.) User selects "smart split" in the share settings

2.) (That's the easiest way I could think of; there are OFC much more efficient, better ways of doing this!!!) Unraid starts logging accesses together with the time

 

2134123123 (time) hdd1/this/ismy/share/somefile.OK

2134123124 (time) hdd2/this/ismy/share/somefile1.OK

2134123125 (time) hdd3/this/ismy/share2/somefile2.OK

2134323123 (time) hdd4/this/ismy/share3/somefile3.OK

2134323123 (time) hdd5/this/ismy/share4/somefile4.OK

2134623123 (time) hdd1/this/ismy/share5/somefile5.OK

2134623123 (time) hdd2/this/ismy/share6/somefile6.OK

2134723123 (time) hdd3/this/ismy/share7/somefile7.OK

 

3.) Once a month, or week, or whatever, a process works through these log entries, finds files which are accessed together X times, and checks whether they are on the same HDD.

4.) If not, it creates a queue to move them to another HDD.

In my example it would move somefile1, somefile2 and somefile3 together onto one HDD (whichever is best in terms of space used).

This would hit performance because you need to log all accesses, but I guess it's worth it to not fiddle around with the split levels :P
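Steps 2 to 4 above could look something like this. It is a minimal sketch, not an Unraid feature: the log format is simplified to `<unix time> <disk>/<path>` (the "(time)" marker from the example is dropped), and `parse_line`, `co_access_counts`, `move_queue`, the 5 second window and the `min_hits` threshold are all assumptions made for illustration.

```python
from collections import Counter


def parse_line(line):
    """Parse '<unix time> <disk>/<path>' into (time, disk, path)."""
    stamp, location = line.split(maxsplit=1)
    disk, _, path = location.strip().partition("/")
    return int(stamp), disk, path


def co_access_counts(lines, window=5):
    """Count how often two files were accessed within `window` seconds."""
    events = sorted(parse_line(line) for line in lines if line.strip())
    counts = Counter()
    for i, (t1, d1, p1) in enumerate(events):
        for t2, d2, p2 in events[i + 1:]:
            if t2 - t1 > window:
                break  # events are sorted, so later ones are even further away
            if (d1, p1) != (d2, p2):
                counts[frozenset([(d1, p1), (d2, p2)])] += 1
    return counts


def move_queue(lines, window=5, min_hits=2):
    """Pairs seen together at least `min_hits` times but stored on different disks."""
    queue = []
    for pair, hits in co_access_counts(lines, window).items():
        (d1, p1), (d2, p2) = sorted(pair)
        if hits >= min_hits and d1 != d2:
            queue.append((p1, d1, p2, d2, hits))
    return sorted(queue)
```

The queue only says which files would ideally sit together. Picking the target HDD would still have to respect free space and the share's allocation method.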

Edited by nuhll
Link to comment
On 4/24/2018 at 4:19 PM, nuhll said:

-Put all files which are in the same directory on the same HDD, if possible (size permitting)-

 

Split level does do this. You get to decide which level of the directory structure to keep on the same disk.

 

If you mean keep all files from a specific user share on the same disk, you can do this by having the user share include only one disk. A user share is just the top level folders on all disks named for the user share.

 

If instead you mean the "top level folder" within a user share then that is the 1st split level.

 

So this seems like a feature that already exists. Maybe you have something else in mind but I have failed to understand what it is.

Link to comment
1 minute ago, nuhll said:

I know that split level does this when you configure it correctly.

BUT MY SUGGESTION IS TO MAKE THIS AN AUTOMATIC OPTION, SO YOU DON'T NEED TO SELECT at which point to split.

Is my English that bad? :(

 

So what split level would you like for the default?

Link to comment

????

 

I want a new split level called "Smart split"

which automatically

-Put all files which are in the same directory on the same HDD, if possible (size permitting)-

OR

the other way around that I described later, which would be more work. But I guess this easy change would already be okay for most users.

Is my English really that bad? :(

Edited by nuhll
Link to comment
6 minutes ago, nuhll said:

Put all files which are in the same directory on the same HDD, if possible (size permitting)-

 

Please specify exactly which directory you mean when you say "same directory". Is it a top directory within a user share? Something else? What exactly is the "smarts" supposed to use to decide what is the smart thing to do?

Link to comment

Okay.

 

Let's say you have a user share for all your files and you don't have a split level set (so it's putting all files randomly across the whole array).

The smart thing would be that it just places all files which are in the same directory on the same HDD in the array, while trying to make use of the specified allocation method.

In this mode, OFC, your directories must not be bigger than any of the chosen HDDs.

 

before:

/some/share/films/film1

-> file1 (hdd1)

-> file2 (hdd2)

-> file3 (hdd3)

 

/some/share/series/serie/serie s01/film1

-> file1 (hdd1)

-> file2 (hdd2)

-> file3 (hdd3)

 

after:

/some/share/films/film1

-> file1 (hdd1)

-> file2 (hdd1)

-> file3 (hdd1)

 

/some/share/series/serie/serie s01/film1

-> file1 (hdd2)

-> file2 (hdd2)

-> file3 (hdd2)

 

It wouldn't only be helpful for people like me who don't really understand how to set the split level correctly by hand; it also helps people like me who use shares where one split level is not enough. (In my case, I don't think I could find a useful split level for my archive share. I would need to change my directories to fit the split level.)

 

Edited by nuhll
Link to comment

For any allocation method other than Most Free (which I don't know of any good reason to use and should be banished IMO), most of the time these files will wind up together anyway.

 

But, let's go ahead and add some more details to this. Here is one of your "after" examples:

 

49 minutes ago, nuhll said:

after:

/some/share/films/film1

-> file1 (hdd1)

-> file2 (hdd1)

-> file3 (hdd1)

 

What should happen if sometime later a subdirectory gets created in /some/share/films/film1? Can the files in that subdirectory go to another disk? If you could keep all the files in that subdirectory together on a different disk would that be OK?

 

A similar but simpler example. You create a share named exampleshare and it has no subdirectories, only files. What should happen if later you create subdirectories in that share?

 

It seems like the logic involved to make anything like this happen would have to impact performance when writing files, figuring out which disk other files in the subdirectory were already on, etc.

 

Link to comment
2 minutes ago, trurl said:

For any allocation method other than Most Free (which I don't know of any good reason to use and should be banished IMO), most of the time these files will wind up together anyway.

 

But, let's go ahead and add some more details to this. Here is one of your "after" examples:

 

 

1. What should happen if sometime later a subdirectory gets created in /some/share/films/film1? Can the files in that subdirectory go to another disk?

 

2. If you could keep all the files in that subdirectory together on a different disk would that be OK?

 

3. A similar but simpler example. You create a share named exampleshare and it has no subdirectories, only files. What should happen if later you create subdirectories in that share?

 

It seems like the logic involved to make anything like this happen would have to impact performance when writing files, figuring out which disk other files in the subdirectory were already on, etc.

 

 

1. Yes, OFC. It's a per-directory split rule.

2. Yes (it's the same question, isn't it?)

3. You create a share without directories, so Unraid can try to keep the ROOT together on one HDD as long as possible, if it is able to honor the allocation method.

All these actions (if they really have a performance impact, which I don't believe, but I could be wrong) could be done when the mover is invoked (and/or if the system is idle, or at specific times).

Link to comment

Except for mover, unRAID never moves anything so the part about

 

10 minutes ago, nuhll said:

if the system is idle, or at specific times

 

doesn't apply.

 

30 minutes ago, nuhll said:

if they really have a performance impact, which I don't believe

 

The user share settings are used to choose an array disk when moving by mover, just as they are used to choose an array disk when writing to any user share which isn't cached. As currently implemented, choosing a disk to write according to the user share settings doesn't require looking at where any other files are currently stored. I'm pretty sure everything it needs to know is already in RAM.

 

So it would likely be a big performance impact since it would have to read disks (maybe all of them would have to be spun up) in order to know about other files when it is already reading and writing disks in order to maintain parity.
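To make the cost concrete, here is a minimal sketch of the lookup a "same directory" rule would need before each write: checking every array disk for existing files in the target directory. It assumes the usual layout where each array disk is mounted separately (for example /mnt/disk1, /mnt/disk2); `disks_holding` is a hypothetical helper, not part of Unraid. Each directory listing that is not already cached would mean reading the disk, and a spun-down disk would have to spin up first.

```python
import os


def disks_holding(rel_dir, disk_roots):
    """Return {disk_root: [file names]} for every disk that has files in rel_dir."""
    found = {}
    for root in disk_roots:
        path = os.path.join(root, rel_dir)
        try:
            with os.scandir(path) as entries:
                names = sorted(e.name for e in entries if e.is_file())
        except (FileNotFoundError, NotADirectoryError):
            continue  # this disk has no such directory
        if names:
            found[root] = names
    return found
```

A call might look like `disks_holding("films/film1", ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3"])`. Keeping directory entries cached in RAM would avoid most of the disk reads, at the price of memory.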

 

Link to comment

Why should it need to read the disk? If what you say is true, that it is in RAM already... you get the point...

To decide where to put a file (which HDD), it also needs to know which directory it goes in, right?!

Overall, does it matter how much CPU, RAM and disk it takes to write files which are in the same directory to the correct disk, when it is moving and accessing the data anyway?

If someone needs a high-speed, high-capacity, ultra-available server, they probably wouldn't choose Unraid, and they could also just set the split level the way they want and not use the new "smart split" option. I guess all other standard users would be happy not to spin up needlessly many disks.

I always feel like people don't want to make software better, or can't imagine how other people might use their Unraid instance.

Since you have more knowledge about Unraid, YOU could tell me / us how it is technically done best. I just make suggestions which, I think, would help the standard user a lot. And that's where you make money. You can make a product specifically for a niche, but Unraid is obviously targeting standard users more and more, which is the right way. This new option is one of many things which could make Unraid more usable for "noobs".

Edited by nuhll
Link to comment

I think it is nowhere near as easy as you seem to think to define an algorithm that covers all Use Cases.   For instance you do not cover the case where a given directory is too large to fit on a single disk or where it fits initially but then outgrows the disk.   You need to remember that unRAID has no facility that moves files to another location after they have been written to the array.   Unless a robust algorithm can be defined then it is not practical to implement anything like you seem to be suggesting.

Link to comment

Did you even read what I wrote? This makes it very hard to make suggestions, because you get the feeling no one cares...

 

 

"  For instance you do not cover the case where a given directory is too large to fit on a single disk or where it fits initially but then outgrows the disk. "

I suggested: if it is possible to fit on one disk (so if not, then leave it where it is). It's that easy. OR, the advanced way, search for an HDD where it fits.

Also, I'm not saying this is useful for every case. I said that for the standard user, who has a "standard directory arrangement", it would be a nice addition.

Edited by nuhll
Link to comment
4 hours ago, nuhll said:

It wouldn't only be helpful for people like me who don't really understand how to set the split level correctly

lol  You're not alone

 

 

I struggled with split levels when I first started, and for my use case came up with the following:

  • I don't particularly care where any particular media file(s) are stored
  • All of my video media is always a single mkv, so I'm not concerned about another drive having to spin up in the middle of the movie to continue watching it.
  • Why would I care on which drive a .nfo file is stored on with relation to the .mkv
  • If I'm binge watching a TV show, does it really bother me if another drive has to spin up when I'm finished watching episode 2 and want to watch episode 3?  Wouldn't be the first 5 seconds of my life I've lost, and more than likely won't be the last.
  • I chose unRaid originally for several reasons.  But the most important one was that I would rather, in a worst case scenario with multiple drive failures (where I've exceeded the redundancy limits), lose some of the files and not all of them.  For this reason, I want my backups and my TV series et al spread out over various disks.  (Years and years ago (pre-unRaid) I did manage to lose all of our pictures - trust me, you're better off losing some pictures than losing all of them.  The nicest wife can be very nasty in that situation)
  • I would prefer that in the situation of multiple users all watching the same series (but different episodes) having those episodes stored on different drives to avoid drive bandwidth issues, and head thrashing.

Hence, I don't bother with split levels at all for any share.  The notable exception however is that I do confine my music to a single drive to avoid spinup pauses.  (There are however offsite backups of that share)

 

Your use case may be different.  My personal opinion is that users tend to get hung up on split levels when it's not really important at all.

 

Link to comment

Yes, that's perfect. Thanks.

You are right, this feature request is not suited for all cases or all people (I guess the advanced people will still use their split level).

And you are right, it doesn't matter where that .nfo is... and so on, BUT I would like to get the max out of the system (a long time without hardware failures), so I would like to only spin up what's needed. I also want to save power. Also, spinning up and down (like going from e01 to e02) is, as far as I know, a factor which wears out your HDD.

If a drive fails today and you have parity (like I have), it's not likely that you lose anything, because you can recover.

Also, that music point is the most annoying thing for me.

But there are other use cases. For example, if I browse the archive, all drives spin up, because Windows gathers information about the files and directories.

In the end, people who don't want to mess with split levels could choose "smart split". Advanced people would just keep setting their split levels the way they want.

Besides the one-directory-is-too-big-for-one-HDD thing, I don't see any downside to my suggestion.

I see it as a quality of life feature for standard noobs. (Most of the time the correct split level is set automatically, to get the most out of Unraid.)

Edited by nuhll
Link to comment

I still don't understand what smart split does, how it works and how it differs from the current split levels or unbalance. What are we discussing?

Some members also pointed out why it won't work, but the OP just ignores it.

Moving any directory will change the timestamp. How will smart split overcome this?

Edited by Benson
Link to comment

How does the split setting overcome this now???

Sorry, but I can't explain it any more simply than this:

before:

/some/share/films/film1

-> file1 (hdd1)

-> file2 (hdd2)

-> file3 (hdd3)

 

/some/share/series/serie/serie s01/film1

-> file1 (hdd1)

-> file2 (hdd2)

-> file3 (hdd3)

 

after:

/some/share/films/film1

-> file1 (hdd1)

-> file2 (hdd1)

-> file3 (hdd1)

 

/some/share/series/serie/serie s01/film1

-> file1 (hdd2)

-> file2 (hdd2)

-> file3 (hdd2)

 


It should be done like split level works now (the same thing, just different RULES). Moving files AFTERWARDS is an additional feature (nice to have).

 

Edited by nuhll
Link to comment
8 minutes ago, nuhll said:

But there are other use cases. For example, if I browse the archive, all drives spin up, because Windows gathers information about the files and directories.

Assuming that you have Dynamix Folder Cache Dirs installed, to the extra command line options add

-p 1

 

Keeps the entries cached unless it's absolutely necessary to get rid of them.  (It defaults to 10, but even at that level, it's too easy IMHO for Linux to drop the entries when today's apps will allocate as much memory as possible) - YMMV

 

Makes a world of difference if you have tons and tons of files / folders.

4 minutes ago, Benson said:

Moving any directory will change the timestamp.

Lots of easy ways around that.

Edited by Squid
Link to comment

That, also.

3 minutes ago, Squid said:

Makes a world of difference if you have tons and tons of files / folders.

Lots of easy ways around that.

 

I also use cache dirs, but like the author says, it can't always be good enough.

There are cases where it's not needed or suited, but I think in most cases it will at least not HURT.

Edited by nuhll
Link to comment
