To Cache drive or not to Cache drive?



Secondly, I couldn't help noticing the section above, from the description of the Tepid Spare.  (great idea by the way!)  As I know you know, you cannot just remove a drive from an unRAID array (step 3 above).  So I wonder if a better procedure would be to replace the Tepid spare with the new drive, and let it rebuild on to it.  That way, the array stays protected, and you no longer need Step 1, copying all of the data to the new drive.  Plus, the data in question is fully backed up on the Tepid Spare while it is being rebuilt onto the new drive, in case anything should go wrong with the rebuild.

 

And there's a 'drive drive' in the first line of the Warm Spare paragraph.

 

I'm still new to unRAID, but I would've thought you could remove a cache drive because it's not REALLY in the array... I guess I'm wrong. Are you saying that once you add a cache drive to your array, you can't ever remove it without rebuilding the entire parity? I guess the same would apply if someone were to shrink their array size (for whatever reason), right?

You can stop the array and un-assign/assign the cache drive at any time.  Parity does not need to be rebuilt.
Link to comment

Right.  What RobJ pointed out is that in my original instructions (which he has quoted above) I forgot to add the step in which you assign the new drive to the array slot previously occupied by the tepid spare.  If you read what RobJ quoted, it sounds like I am telling you to remove a drive from the array without replacing it (which would make you lose parity protection).  If you look back at the first post in this thread you'll see that I've fixed the omission and added a step in which you assign the new drive into the array to take the place of the tepid spare.  Thanks again to RobJ for pointing out this error.

Link to comment

Right.  What RobJ pointed out is that in my original instructions (which he has quoted above) I forgot to add the step in which you assign the new drive to the array slot previously occupied by the tepid spare.  If you read what RobJ quoted, it sounds like I am telling you to remove a drive from the array without replacing it (which would make you lose parity protection).  If you look back at the first post in this thread you'll see that I've fixed the omission and added a step in which you assign the new drive into the array to take the place of the tepid spare.  Thanks again to RobJ for pointing out this error.

 

It took comparing the old and new instructions for me to understand what was going on. I see now; I think I kinda confused myself the first time :P

Link to comment

Is there any real benefit to using a green drive as the cache drive?  All the talk is about 7200 RPM and up, so I was just curious if there is a gain.  Thinking about going the warm spare route.

 

The talk is from those trying to get the absolute fastest write speed to the array.  It is a different need from that of those trying to keep energy use down.  As a warm spare, a drive similar to those in the array is best.
Link to comment

I agree with Joe L.  From what I have seen, a green drive is fine, but if you have a number of disks in your system you can often see a speed benefit by making sure that the cache drive is connected to a motherboard controller rather than via a controller in a PCI-e or PCI slot.

Link to comment

Hi Folks

 

I'm new here and this is my first post  :)

 

I was wondering since...

(a) one of the dangers of using the cache drive is that the data is unprotected until it is moved to the parity-protected array, and

(b) there is a (slight) potential for trouble if both the array and the cache disk fill up...

 

why not just run the mover script continuously (i.e. as soon as some data comes into the cache drive, start moving it to the parity-protected array immediately)?

 

I was wondering why the caching system wasn't written this way in the first place.  (Of course, you could kind of simulate this by running the mover script on a fairly frequent schedule.)

 

The performance benefit of the cache drive is to (1) quickly write the new data to the cache drive and acknowledge back to the application doing the writing, then (2) slowly write the data, with parity calculated, to the array in the background.  Why is step (2) done in batch fashion at 3:40am each night? (I know it can be run more frequently, but I'm wondering why it doesn't just happen immediately/continuously.)  Wouldn't that be safer?

 

Sorry if I'm missing some fundamental point (I'm good at that! ;))

 

Cheers

 

Toby

 

Link to comment

why not just run the mover script continuously (i.e. as soon as some data comes into the cache drive, start moving it to the parity-protected array immediately)?

 

I was wondering why the caching system wasn't written this way in the first place.  (Of course, you could kind of simulate this by running the mover script on a fairly frequent schedule.)

 

The performance benefit of the cache drive is to (1) quickly write the new data to the cache drive and acknowledge back to the application doing the writing, then (2) slowly write the data, with parity calculated, to the array in the background.  Why is step (2) done in batch fashion at 3:40am each night? (I know it can be run more frequently, but I'm wondering why it doesn't just happen immediately/continuously.)  Wouldn't that be safer?

 

You use a cache drive to speed up writes. If you write data to the cache drive and move that data to the parity-protected array at the same time, then you're going to get slow write speeds (hard drives aren't exactly good at simultaneous random reads and writes). It would probably be better to just forgo the cache drive and write the data directly to the array.

Link to comment

Because you are sending an acknowledgement back to your application (saying "OK, I've written your data") as soon as it hits the cache drive, and you're not waiting for it to also hit the parity array (two different steps), your application won't have to wait as long and it'll run faster (no perceived delay caused by the parity writes).  So I'm wondering why unRAID doesn't do it this way (i.e. write to the cache disk, acknowledge back to the application, flush the cache to the parity array ASAP).  Hmmm...

Link to comment

Because you are sending an acknowledgement back to your application (saying "OK, I've written your data") as soon as it hits the cache drive, and you're not waiting for it to also hit the parity array (two different steps), your application won't have to wait as long and it'll run faster (no perceived delay caused by the parity writes).  So I'm wondering why unRAID doesn't do it this way (i.e. write to the cache disk, acknowledge back to the application, flush the cache to the parity array ASAP).  Hmmm...

It can; you just need to change the default "mover" schedule in the settings so that it checks for files to be moved more frequently.  The mover will NOT move a file that is open for reading or writing, so it is safe to invoke it more frequently.
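
For example, on versions where the schedule is a cron entry, it's a one-line change.  A sketch only (I'm recalling the 3:40am default and the /usr/local/sbin/mover path from memory, so verify against your own system):

    # default: check once a day at 3:40am
    # 40 3 * * * /usr/local/sbin/mover
    # check hourly instead, on the hour:
    0 * * * * /usr/local/sbin/mover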
Link to comment

So I'm wondering why unRAID doesn't do it this way (i.e. write to the cache disk, acknowledge back to the application, flush the cache to the parity array ASAP).  Hmmm...

 

Probably to prevent situations where you're writing to the cache drive while it's moving data to the parity-protected array. What if you had just finished a 100GB copy to the cache drive, the mover script runs, and then you copy another 100GB to the cache drive? Yeah, not fun. Unless you're nocturnal, there's less chance of that happening at 3:40AM. :P

Link to comment

I run the mover once a week. I will likely never fill my 500GB cache drive in a week, so why not? Then, when I access the new media, the server doesn't spin up the data drives. I have set the more critical shares, like pictures and storage, to not use the cache drive. The TV and movies I'd lose in a week can be replaced.

 

Peter

Link to comment

Thanks Peter,

 

The reminder about selectively enabling the use of the cache for some shares but not others is very helpful.  And turning the cache on and off on a per-share basis according to need eliminates the need to expose the disk shares, with their potential for confusion (especially if other users / family members are given access).
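
For anyone else wondering where that setting lives: as far as I can tell, each share keeps its own config file on the flash drive, so cache use can be toggled per share with a single value.  A sketch only (the file names and the shareUseCache parameter are from my own notes, so your version may differ):

    # /boot/config/shares/Pictures.cfg  (critical data: write straight to the array)
    shareUseCache="no"

    # /boot/config/shares/TV.cfg  (replaceable media: take the cache speed-up)
    shareUseCache="yes"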

 

Les.

Link to comment

@Joe L ... good to know the option exists for a semi-continuous running of the mover script.

 

@lionelhutz/S80_UK ... good reminder to me that everyone's usage patterns may differ.  I personally have new data writing 24/7 so it might actually be best for me to run without a cache drive altogether for maximum protection (performance isn't a major consideration for me anyway because I currently do a lot of transfers via wireless-N and maybe only get a maximum of 10MB/sec, but writing to the parity array using wired GbE should get me at least 14MB/sec).

 

@ilovejedd ... correct me if I'm wrong, but I think you're saying that the real issue with continuously moving the data from the cache drive to the parity-array is contention (competition/thrashing) of the cache drive between the application(s) writing to the cache drive and the mover script reading/writing from/to the cache drive (writing because it's moving the data off the cache drive).  I believe you're saying all that thrashing will significantly impact performance and remove any (speed) benefits of having a cache drive?  If that's true, then I think you've answered my original question as to why the mover script doesn't run continuously.  Thanks!

 

Cheers

 

Toby

Link to comment

Exactly. The cache disk will be seeking between the two operations and it will significantly slow down.

 

You would be better off running without a cache disk if you need such quick protection against a drive failure.

 

Honestly, you might be better off with a RAID6 array for the transient data and using the unRAID server for archiving or backup. You really need to figure out the intended usage before deciding on the solution. And unRAID is intended for media storage, where write speed was not the main design consideration. Right from the homepage: "unRAID Server is an embedded Network Attached Storage server Operating System designed to boot from a USB Flash device and specifically designed for digital media storage."

 

Peter

 

Link to comment

@Peter...

 

This reminds me of one of *the* fundamentals of computer sales I learnt a loooong time ago when I first entered the I.T. industry ... the first question to a customer should always be "what do you want to use it for?".

 

And even within one main design goal, such as unRAID being targeted at media storage/serving, it's still important to ask this and similar questions (such as *how* are you intending to use this?).

 

In my case, I actually am using it for media storage/serving (hence why I was drawn to the product), but it's a learning curve and I'm trying to understand if/how I'd use the cache drive.

 

Since,

(a) write performance isn't my top priority, and in fact parity-writing without a cache drive will be faster than my current method anyway, and

(b) sooner-rather-than-later data protection is moderately important to me (because I'm lazy and don't want the hassle of re-creating/re-downloading the data, not because the data is in any way "mission critical" - far from it!!), and

(c) the other stated side-benefits of the cache drive (e.g. warm spare, tepid spare (funny term, that!!), place to put apps, etc.) aren't of any real importance to me...

 

I guess I'll just go with the KISS principle and forego the cache drive altogether and happily write directly to the parity array at "reduced" speeds of around 14MB/sec.

 

It's a (fun) journey...

 

Thanks for your help folks!

 

Cheers

 

Toby

Link to comment
  • 4 weeks later...

At 14MB/s it would take forever for me to back up my Windows machines or rip a BD or transfer data. That is only 112mb/s. I'm using a Seagate 5900 RPM cache drive and I get 800mb/s to 880mb/s transfer rates over my network. I realized that without a cache drive I could not transfer the terabytes of data I produce. Transfers to my array are around 30MB/s from my cache drive, but that is still only 240mb/s, which is still too slow for me to do my backups for my Windows 7 machines. So in the end I was glad I decided to use a cache drive.

Link to comment

@Peter...

 

In my case, I actually am using it for media storage/serving (hence why I was drawn to the product), but it's a learning curve and I'm trying to understand if/how I'd use the cache drive.

 

 

I guess I'll just go with the KISS principle and forego the cache drive altogether and happily write directly to the parity array at "reduced" speeds of around 14MB/sec.

 

 

 

The difference in performance and flexibility using a cache drive is such that I can't believe anyone would not want to use one.

 

I can move files to the unRAID box very quickly (up to 70MBytes/sec from one of my computers) and the moved files are immediately available to other users in their appropriate share. The system moves the cached files to the array every 12 hours. Admittedly I am using a Pro licence, so using a valuable drive slot for the cache is not an issue, but the performance gains would be worth it even on the Plus version.

 

IMHO

Link to comment

I'm using split level 0, and created a FLAC directory at the top of my cache drive. In my share, my FLAC directory is 3 levels deep. It copied my FLAC directory to one of my drives at the top level, but it was also in the 3-levels-deep directory. How exactly do the mover script and split level 0 work? I thought it allows for a top-level directory among many drives and then second-level directories among any number of those drives, so you can evenly distribute your categories. Sorry, I read the Unofficial doc but it's still not 100% clear :/

Link to comment

A thought ...

 

One of the main concerns with the cache drive is contention between writing to it and reading from it (as the data is moved to the array) at the same time (hence why the mover script defaults to running at 3:40am once per day to give the lowest chance of competing with any writes).

 

I wonder how easy it would be to have some sort of watchdog script which automatically starts/stops the mover script whenever the cache drive is idle/busy...?  Then there's never any contention, no need to have the mover run on some schedule, and the data is sitting unprotected on the cache for the least amount of time.
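
Something along these lines is what I have in mind.  Just a rough sketch (I'm assuming the cache shows up as sdb and that the mover lives at /usr/local/sbin/mover; both would need checking on a real system):

    #!/bin/bash
    # Rough watchdog sketch: invoke the mover only when the cache drive looks idle.
    CACHE_DEV="sdb"                 # assumed cache device name
    MOVER="/usr/local/sbin/mover"   # assumed mover location

    # Sum the sectors-read and sectors-written counters for the cache device.
    read_io() {
        awk -v d="$CACHE_DEV" '$3 == d { print $6 + $10 }' /proc/diskstats
    }

    while true; do
        before=$(read_io)
        sleep 60                    # sample one minute of activity
        after=$(read_io)
        if [ "$before" = "$after" ]; then
            # No sectors moved in the last minute - treat the drive as idle.
            "$MOVER"
        fi
    done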

Link to comment

There is such a script, and it was implemented, though I do not feel like searching for it right now.

 

The one problem with this approach is that my cache drive is never idle.  I have my torrents going to that drive, so it never spins down and is never not busy.

 

The mover script will not move data that is being used, so there is no issue of data being partially moved.
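
If you're curious how the "won't move open files" part can work: the usual trick is to test each file with fuser and skip anything a process still has open.  A simplified sketch of the idea only, not the actual mover, and the /mnt/user0 destination is my assumption:

    # Simplified idea only - not the real mover script.
    find /mnt/cache -type f | while read -r f; do
        if ! fuser -s "$f"; then
            # Assumed destination: the array-only view of the user shares.
            dest="/mnt/user0${f#/mnt/cache}"
            mkdir -p "$(dirname "$dest")"
            mv "$f" "$dest"
        fi
    done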

Link to comment

So perhaps the best combination would be ...

1) Cache drive is based on SSDs or 10K RPM or 7.2K RPM drives (for maximum performance);

2) Cache drive is actually a mirrored RAID 1 logical drive, so data is always protected (not as fast as a single drive, but should be much faster than writing directly to the unRAID array, especially if using SSDs);

3) A watchdog script which calls the mover script whenever the cache drive is idle (so the cache drive could be smaller (SSDs) because the data is being moved off the drive more regularly, except if writing to the cache drive continuously, e.g. torrents);

4) The mover script additionally runs at 3:40am every day (to essentially force-move all files to the array, except any open files).  A rough crontab for this combination is sketched below.

 

???
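
To make 3) and 4) concrete, the crontab might look something like this (mover-watchdog.sh being a hypothetical helper along the lines of the watchdog idea above):

    # every 10 minutes: run the mover if the cache drive is idle
    */10 * * * * /boot/custom/mover-watchdog.sh
    # 3:40am daily: force a move of everything (open files still skipped)
    40 3 * * * /usr/local/sbin/mover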

Link to comment
