Any convention on including different disks for different tasks?


PzrrL


I have 12 disks (10 HDDs + 2 SSDs) that can be used for different tasks. By "task" I mean something I would like to achieve.

Tasks as below:

- VM

- bittorrent

- Docker

- appdata (I guess this is the Docker's appdata)

- Personal Windows 10 backup

- Unraid USB backup

- Time Machine for Mac

- Movies

- Photos

- Documents

- Minecraft Server (or any other server eg Garry's Mod, L4D2...)

 

So I am trying to assign different disks to different tasks (and shares), so that each task uses its own disks and no single disk gets thrashed too much. Is this assignment appropriate, or do you have any recommendations?

0. 2 SSDs for cache drives

1. 2 largest disks for parity

2. 1 disk for BitTorrent (as a scratch disk; I will use the Unassigned Devices plugin and not put this disk in the array) (Cache: No)

3. 1 disk for Windows backup, USB backup, appdata backup, Time Machine (Cache: No)

4. 1 disk for Minecraft Server (Cache: Yes? Prefer?)

5. 1 disk for VM (Cache: Not sure)

6. 1 disk for Movies (Cache: Yes, for faster read?)

7. 1 disk for Photos

8. 1 disk for Home document sharing across people

 

And I guess 6-8 can actually be combined into 1 share?

 

I am not sure if this is an appropriate way to balance the load of my tasks. Please also advise on which shares should use the cache. Thanks!

 

Sorry if I am not asking my question in an organized way lol

 

Edited by PzrrL
Edit title
Link to comment

Your numbers 2 and 6 show you don't quite understand the Unraid cache settings.

Cache = Yes speeds up writes but it won't speed up reads (except for data that has yet to be moved to the array).

An unassigned device by default does not use the cache pool - "unassigned" = not in the cache pool or array.

 

Having a separate disk for a separate task is kinda defeating the purpose of having an array.

Unless a task NEEDS to be on a single drive / SSD cache, you should spread them out across multiple drives on the array.

That will maximize the utilisation of your drives.

Then you use the Share functionality to split them into tasks.

 

What NEEDS to be on a single drive? Basically anything that would be crippled by low write speed and/or receives frequent but low-priority IO.

What NEEDS to be on the SSD cache? Basically anything for which responsiveness is critical (i.e. random IO).

 

So it should be something like this:

  • VM: cache = only
  • bittorrent: unassigned (or cache = only if data is not too large)
  • Docker: cache = only
  • appdata (I guess this is the Docker's appdata): cache = only
  • Personal Windows 10 backup: cache = no
  • Unraid USB backup: cache = no
  • Time Machine for Mac: cache = no
  • Movies: cache = no
  • Photos: cache = no
  • Documents: cache = no (or cache = prefer if data is not too large)
  • Minecraft Server (or any other server eg Garry's Mod, L4D2...): don't know
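For reference, the list above can be written down as a small lookup table. This is only a sketch in Python: the share labels are hypothetical names taken from the task list, not Unraid configuration keys, and the values simply restate the suggestions above.

```python
# Hypothetical share labels mapped to the "Use cache" setting suggested above.
# These are illustrative names only, not keys from any real Unraid config file.
RECOMMENDED = {
    "VM": "only",
    "bittorrent": "unassigned",  # or "only" if the data is not too large
    "Docker": "only",
    "appdata": "only",
    "Windows 10 backup": "no",
    "Unraid USB backup": "no",
    "Time Machine": "no",
    "Movies": "no",
    "Photos": "no",
    "Documents": "no",           # or "prefer" if the data is not too large
    "Minecraft Server": None,    # "don't know" in the list above
}


def shares_with(setting):
    """Return the share labels whose suggested setting equals `setting`."""
    return [name for name, value in RECOMMENDED.items() if value == setting]


print(shares_with("only"))  # ['VM', 'Docker', 'appdata']
print(shares_with("yes"))   # []
```

Grouping by setting makes it easy to see that only three shares are pinned to the cache pool and none use Cache = Yes.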

 

You will notice that I don't have anything Cache = Yes.

For an array with reconstruct write (aka turbo write) = on and modern HDDs, the write speed can quite often exceed gigabit (125 MB/s). In addition, Unraid (and Linux in general) will automatically use free RAM to cache writes first, i.e. short bursts of writes will be super fast regardless of other settings.

That means for most home users, the benefit of using the cache pool as a write cache is rather limited, which usually doesn't justify reducing the lifespan of the more expensive SSDs.
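The 125 MB/s figure is just the unit conversion for a 1 Gbit/s link. A quick sketch of the arithmetic in Python (the function name is made up for illustration, and protocol overhead is ignored, so real transfers come in somewhat lower):

```python
def link_speed_mb_per_s(gigabits_per_second):
    """Convert a link speed in Gbit/s to MB/s (decimal units, no overhead)."""
    bits_per_second = gigabits_per_second * 1_000_000_000
    bytes_per_second = bits_per_second / 8  # 8 bits per byte
    return bytes_per_second / 1_000_000     # bytes -> megabytes


print(link_speed_mb_per_s(1))  # 125.0
```

So if the array can already sustain writes at around that rate, the network rather than the disks is the limit for transfers over a gigabit connection.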

 

Cache = Prefer serves as a fail-safe to move stuff off the cache pool into the array to reduce the chance of it filling up.

However, as I mentioned above, you usually don't need shares with Cache = Yes, so the chance of filling up the cache pool (which usually happens due to Cache = Yes shares) would not be that high.

In fact, things like the docker image, VM vdisks and appdata should not / cannot be moved while in use. But besides those 3, there isn't much else that NEEDS to be on the cache pool anyway.

Edited by testdasi
Link to comment

@testdasi  Thanks for your detailed answer!

 

13 minutes ago, testdasi said:

Having a separate disk for a separate task is kinda defeating the purpose of having an array

Or, to put it another way: I am going to use a group of disks to achieve a task. Does this make more sense to you?

 

13 minutes ago, testdasi said:

What NEEDS to be on a single drive? Basically anything that would be crippled by low write speed and/or receives frequent but low-priority IO.

What NEEDS to be on the SSD cache? Basically anything for which responsiveness is critical (i.e. random IO).

 

Can you please give some examples for things that need to be on a single drive?

 

I don't understand why VMs should be cache: only. Let's say I shut down Unraid: where is the VM going to be saved? If it still persists on the cache drive, then I assume it loses the protection of the parity drive on the data array. Please correct me if I am wrong.

 

Moreover, I read some posts suggesting that I should use cache: prefer for VMs and Dockers, so what's the difference between cache: only and cache: prefer for this kind of task? I know I am very confused about the cache drive part, even though I have read a lot about it already, so please forgive me.

Edited by PzrrL
Link to comment
1 hour ago, PzrrL said:

Or, to put it another way: I am going to use a group of disks to achieve a task. Does this make more sense to you?

 

Can you please give some examples for things that need to be on a single drive?

 

I don't understand why VMs should be cache: only. Let's say I shut down Unraid: where is the VM going to be saved? If it still persists on the cache drive, then I assume it loses the protection of the parity drive on the data array. Please correct me if I am wrong.

 

Moreover, I read some posts suggesting that I should use cache: prefer for VMs and Dockers, so what's the difference between cache: only and cache: prefer for this kind of task? I know I am very confused about the cache drive part, even though I have read a lot about it already, so please forgive me.

You are really overthinking it. But if you need a task to access only a group of disks in your array, you can use the Include functionality of the Share to include only those disks for the share.

 

An example of things that need to be on a single drive is your download temp, i.e. things that need to be processed further. No parity protection is required, and the write-heavy nature means low write speed will cripple it (and in fact it will cripple other activities as well, because high IO wait causes lag).

 

You said you have 2 SSDs, which we would assume to be running in RAID-1, i.e. mirror. All other things being equal, mirror beats parity any day.

Moreover, critical data such as your VM vdisk should be backed up. Parity protection is not a replacement for a backup.

 

Cache = only: data always on cache

Cache = prefer: data is on cache. If cache full then mover moves data to array. Once cache is freed up then mover moves data back.

 

I subscribe to Occam's razor i.e. the simpler solution is usually the right one.

I have found cache = prefer to cause troubles e.g. with plex appdata db being half-array-half-cache so I would rather set things up such that I won't unexpectedly run out of space rather than relying on the mover to "solve" it.

Link to comment
7 minutes ago, testdasi said:

Cache = prefer: data is on cache. If cache full then mover moves data to array. Once cache is freed up then mover moves data back.

This is not exactly how it works. Cache-prefer overflows new writes to the array if cache doesn't have Minimum Free (in Global Share Settings). Mover never moves cache-prefer to the array, it only moves from array to cache when cache has room.
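To make the distinction concrete, here is a toy model in Python of the behaviour described in this thread for cache = only and cache = prefer. It is a sketch, not Unraid's implementation: the names `Pool`, `write_target` and `mover_action` are invented for illustration, and treating "cache has room" as "free space is at least Minimum Free" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    free_gb: float          # free space currently on the cache pool
    minimum_free_gb: float  # the Minimum Free threshold


def has_room(pool):
    # Assumption: "has room" means free space is at least Minimum Free.
    return pool.free_gb >= pool.minimum_free_gb


def write_target(mode, pool):
    """Where a NEW write lands for a share with the given cache mode."""
    if mode == "only":
        return "cache"  # data always on cache
    if mode == "prefer":
        # Overflow to the array when cache doesn't have Minimum Free.
        return "cache" if has_room(pool) else "array"
    raise ValueError(f"mode not modelled in this sketch: {mode}")


def mover_action(mode, pool, files_on_array):
    """What the mover does for the share when it runs."""
    if mode == "prefer" and files_on_array and has_room(pool):
        return "array -> cache"  # never cache -> array for prefer
    return "nothing"


nearly_full = Pool(free_gb=5, minimum_free_gb=10)
roomy = Pool(free_gb=50, minimum_free_gb=10)

print(write_target("prefer", nearly_full))                # array
print(write_target("prefer", roomy))                      # cache
print(mover_action("prefer", roomy, files_on_array=True)) # array -> cache
```

The point of the model is that, for cache-prefer, overflow happens at write time, while the mover only ever brings overflowed files back to the cache.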

Link to comment
3 minutes ago, testdasi said:

An example of things that need to be on a single drive is your download temp

That would make sense. Therefore it is right for me to put Bittorrent disk in Unassigned, right?

 

4 minutes ago, testdasi said:

You said you have 2 SSDs, which we would assume to be running in RAID-1, i.e. mirror

Yes, I will be running the SSDs in mirror.

 

5 minutes ago, testdasi said:

Moreover, critical data such as your VM vdisk should be backed up. Parity protection is not a replacement for a backup

That means I am going to back up the VM vdisk from time to time from my cache disk to my data array, is that correct?

 

8 minutes ago, testdasi said:

I have found cache = prefer to cause troubles e.g. with plex appdata db being half-array-half-cache so I would rather set things up such that I won't unexpectedly run out of space rather than relying on the mover to "solve" it.

I haven't tried Plex yet. Could you briefly describe how you "set things up"? Is it something related to the settings in Plex, or what?

 

Sorry for asking so many questions, many thanks!!!

 

 

Link to comment
1 minute ago, trurl said:

This is not exactly how it works. Cache-prefer overflows new writes to the array if cache doesn't have Minimum Free (in Global Share Settings). Mover never moves cache-prefer to the array, it only moves from array to cache when cache has room.

So when exactly will the overflowed writes write back to cache? By invoking mover? Is this auto or manual?

Link to comment
1 minute ago, PzrrL said:

That would make sense. Therefore it is right for me to put Bittorrent disk in Unassigned, right?

 

That means I am going to back up the VM vdisk from time to time from my cache disk to my data array, is that correct?

 

I haven't tried Plex yet. Could you briefly describe how you "set things up"? Is it something related to the settings in Plex, or what?

Yes

Yes

Nothing special. I put Plex appdata in a share with cache = only.

 

 

2 minutes ago, trurl said:

This is not exactly how it works. Cache-prefer overflows new writes to the array if cache doesn't have Minimum Free (in Global Share Settings). Mover never moves cache-prefer to the array, it only moves from array to cache when cache has room.

Thanks for the clarification. I thought everything was controlled by the mover.

Link to comment
3 minutes ago, PzrrL said:

So when exactly will the overflowed writes write back to cache? By invoking mover? Is this auto or manual?

Mover is the way things get moved based on the use cache settings. Mover can be invoked manually. It also runs on schedule, the default is daily in the middle of the night. Mover is intended for idle time. There is also a plugin to invoke mover based on how full cache is.

Link to comment

@testdasi Thanks so much for all your answers!!! Since I have a combination of different drives (WD Green, Black and white labels), my original thought was to put heavy I/O work (such as parity) on the white labels and the Black (for appdata), and to use the Greens as archive/BT drives (since they are already old). That's why I had the idea of dedicating particular disks to one task. Thanks for all the clarification!!!

 

@trurl Thanks for your info on cache!!!

Link to comment
