Hikakiller

Unraid not using ram cache


I've noticed Unraid doesn't seem to be using my RAM cache, or it's only using it in small bursts. 

When writing to my array over the 10GbE network, the speed has deep troughs. It will start writing, say, a movie at 200MB/s, but basically stops every 2 seconds. During each stall I see disk usage go up on the server. 

Why not either write the whole file directly to the disks, or write it all to RAM and complete the move in the background? 

 

I have 256GB of RAM and two Xeon E5-2643 v2s, so I doubt it's a lack of server resources. 

On top of that, the files are usually <10GB. That's small compared to my available RAM. 

 

Total RAM allocated to VMs is 20GB: 4GB for Ubuntu and 16GB for Windows, with however much a deluge-vpn Docker container needs being allocated dynamically. 

 

Is there anything I can do to increase performance? 

Does it have anything to do with settings such as nr_requests, md_sync_window, etc.? 

Thanks.

Posted (edited)

RAM cache is mostly controlled by these:

sysctl vm.dirty_ratio - default is 20% of free RAM

sysctl vm.dirty_background_ratio - default is 10% of free RAM

You can experiment and set them higher, but note that performance will likely suffer when more data needs to be committed.

See here for an explanation of what these values change:

https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
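
For reference, you can check the current values and experiment from the console. A minimal sketch (the value below is just an example, not a recommendation):

# Show the current writeback thresholds
sysctl vm.dirty_ratio vm.dirty_background_ratio

# Raise the hard limit for this boot only (example value)
sysctl -w vm.dirty_ratio=25

Note these changes don't survive a reboot, since Unraid loads fresh from flash each boot.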

Edited by johnnie.black

Posted (edited)
19 minutes ago, johnnie.black said:

RAM cache is mostly controlled by these:

sysctl vm.dirty_ratio - default is 20% of free RAM

sysctl vm.dirty_background_ratio - default is 10% of free RAM

You can experiment and set them higher, but note that performance will likely suffer when more data needs to be committed.

See here for an explanation of what these values change:

https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

Hmm, even at the default ratios, those amounts would be far larger than my file transfer sizes. Maybe something else is the issue? 

Even transferring World of Warcraft (Classic) over the network takes minutes, not tens of seconds. It's 4.5GB, and I'm seeing a peak of 200MB/s, then a drop to 5, and so on. At the default values it should basically load straight into RAM. 

Edited by Hikakiller

13 minutes ago, Hikakiller said:

Maybe something else is the issue? 

Likely, yes. You can run iperf with a single stream to rule out the network.
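
Something along these lines; a sketch assuming iperf3 is installed on both ends, with a placeholder server IP:

# On the Unraid server:
iperf3 -s

# On the client, a single stream (-P 1) for 10 seconds:
iperf3 -c 192.168.1.100 -P 1 -t 10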

5 hours ago, johnnie.black said:

Likely, yes. You can run iperf with a single stream to rule out the network.

Isn't iperf for network speed testing? I'm getting the full 10GbE. And I've also tried it between two VMs on the array, with two separate SSDs. 

49 minutes ago, Hikakiller said:

I'm getting the full 10GbE.

Sorry, I misunderstood because of this:

7 hours ago, Hikakiller said:

It will start writing, say, a movie at 200MB/s,

I would expect the initial speed to be close to 1GB/s; that's what I get while it's being cached.

Posted (edited)
44 minutes ago, johnnie.black said:

Sorry, I misunderstood because of this:

I would expect the initial speed to be close to 1GB/s; that's what I get while it's being cached.

I suppose I could be getting less than 10GbE, but I think that's just my cache drives maxing out. I've got dual WD Blacks in RAID 0. 

 

Most of my array is HDDs. If I throw my 970 EVO Plus in there, or my 850 Pro, I can write to them faster than 200MB/s. But that's not the main issue. 

 

I have gone above 1GB/s, measured by Windows Task Manager. And both sides of the link show 10GbE.

 

I'll try raising and lowering the caching ratios, but I don't know which would be ideal. I'm on a dual UPS backup with about 20 minutes of runtime each, so I suppose I could raise them. 

But how often do you copy something you have only a single copy of? 

 

I'm guessing I can calculate the maximum RAM to set aside for caching by multiplying my disk speed by my battery backup runtime and subtracting about 20% for safety; then I should have enough time for the OS to destage the cache. 
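
A back-of-the-envelope version of that, with hypothetical figures (200MB/s sustained array write speed, 20 minutes of UPS runtime):

DISK_MBPS=200                  # sustained array write speed, MB/s
RUNTIME_S=$((20 * 60))         # UPS runtime in seconds
SAFE_MB=$(( DISK_MBPS * RUNTIME_S * 80 / 100 ))
echo "$(( SAFE_MB / 1024 ))GB" # ~187GB of dirty data could still be destaged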

Edited by Hikakiller

Posted (edited)
16 hours ago, johnnie.black said:

RAM cache is mostly controlled by these:

sysctl vm.dirty_ratio - default is 20% of free RAM

sysctl vm.dirty_background_ratio - default is 10% of free RAM

You can experiment and set them higher, but note that performance will likely suffer when more data needs to be committed.

See here for an explanation of what these values change:

https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

I tried this. I set the background ratio to 5% and the dirty ratio to 90%, then transferred a 5GB file. It took 25 seconds. 

I also set the flush time to 8 minutes for good measure; it didn't change anything.
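
For reference, the equivalent commands would be roughly these (a reconstruction; I'm assuming the "flush time" maps to vm.dirty_expire_centisecs):

sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=90
sysctl -w vm.dirty_expire_centisecs=48000   # 8 minutes, in centiseconds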

 

 

I'm quite confused about the naming scheme here. Wouldn't these values mainly be used by VMs? 

Edited by Hikakiller

Posted (edited)

If you want the maximum written to cache before it flushes to media, you should try a dirty ratio of 90% and a background ratio of 89% (assuming you understand the risk).
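
A sketch of that (aggressive and risky: up to ~89% of RAM can hold unflushed data, which is lost on a crash or power failure):

sysctl -w vm.dirty_ratio=90
sysctl -w vm.dirty_background_ratio=89

To keep the values across reboots you could add those lines to the boot script at /boot/config/go (the usual Unraid location; adjust if your setup differs).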

Edited by Benson

11 hours ago, Hikakiller said:

And both sides of the link show 10gbe.

Link speed doesn't really matter; that's why I suggested using iperf to test actual network bandwidth.

7 hours ago, Benson said:

If you want the maximum written to cache before it flushes to media, you should try a dirty ratio of 90% and a background ratio of 89% (assuming you understand the risk).

Yeah, I do. Thanks, I'll try that. 

 

6 hours ago, johnnie.black said:

Link speed doesn't really matter; that's why I suggested using iperf to test actual network bandwidth.

Alright, I'll do that. 

6 hours ago, johnnie.black said:

Link speed doesn't really matter; that's why I suggested using iperf to test actual network bandwidth.

Here are my results. [screenshot of iperf results attached]

Posted (edited)
On 10/7/2019 at 7:38 AM, johnnie.black said:

RAM cache is mostly controlled by these:

sysctl vm.dirty_ratio - default is 20% of free RAM

sysctl vm.dirty_background_ratio - default is 10% of free RAM

You can experiment and set them higher, but note that performance will likely suffer when more data needs to be committed.

See here for an explanation of what these values change:

https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

I almost hate to say this, but I believe this is a bit in error. These settings (as I recall) control the amount of RAM that is set aside for the delayed-write-to-disk buffer, so that the lack of user-perceived response during disk writes is minimized: typically, typing on the keyboard and not having the characters appear on the screen until several seconds later, when they suddenly showed up in a burst. (This was back in the days of much slower single-core CPUs and much smaller amounts of RAM.) As I understand it, Unraid is supposed to automatically use any unallocated RAM as a RAM cache. But with the addition of Dockers and VMs, I wonder if some other factors now enter into how this is actually implemented. Perhaps someone from @limetech could comment.

Edited by Frank1940

4 hours ago, Frank1940 said:

As I understand it, Unraid is supposed to automatically use any unallocated RAM as a RAM cache.

It will use any unallocated RAM for read cache; write cache (dirty RAM) is controlled by those variables, and there's also another one for the maximum time before a flush.


Interesting, look at the output of this command:

root@Rose:~# sysctl -a | grep dirty
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 1
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 2
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200

I understand why vm.dirty_background_ratio is 1% and vm.dirty_ratio is 2% (that is what I set them to a while back to prevent out-of-memory issues), but the vm.dirtytime_expire_seconds of 43200 is 12 hours! That says the data could sit there for 12 hours before it is forced to disk...

18 minutes ago, Frank1940 said:

but the vm.dirtytime_expire_seconds of 43200 is 12 hours!

 

18 minutes ago, Frank1940 said:

vm.dirty_expire_centisecs = 3000

This is the one that sets the max time before a flush: 3000 centiseconds, i.e. 30 seconds.

53 minutes ago, johnnie.black said:

Don't understand how results can be faster than 10GbE.

Sorry, this is a VM on the Unraid array. 

 

9 minutes ago, Hikakiller said:

Sorry, this is a VM on the Unraid array. 

So there's no point in doing that; you need to test with the computer you're copying from as the source.

Posted (edited)
24 minutes ago, johnnie.black said:

So there's no point in doing that; you need to test with the computer you're copying from as the source.

Ah, sorry. My main Windows computer is virtualized on my Unraid machine. 

Edited by Hikakiller

9 hours ago, Frank1940 said:

I almost hate to say this, but I believe this is a bit in error. These settings (as I recall) control the amount of RAM that is set aside for the delayed-write-to-disk buffer, so that the lack of user-perceived response during disk writes is minimized: typically, typing on the keyboard and not having the characters appear on the screen until several seconds later, when they suddenly showed up in a burst. (This was back in the days of much slower single-core CPUs and much smaller amounts of RAM.) As I understand it, Unraid is supposed to automatically use any unallocated RAM as a RAM cache. But with the addition of Dockers and VMs, I wonder if some other factors now enter into how this is actually implemented. Perhaps someone from @limetech could comment.

I kind of suspected that this isn't related to network cache size. Look at this post. 

@johnnie.black

Are you sure these are the correct values? There's even a value for TCP file caching over the network in sysctl. 

[screenshot attached: Screenshot_20191008-132435_Chrome.jpg]

21 hours ago, Benson said:

If you want the maximum written to cache before it flushes to media, you should try a dirty ratio of 90% and a background ratio of 89% (assuming you understand the risk).

Sure, but from my understanding the background ratio is where it starts flushing the cache, and the dirty ratio is where it hard-stops and forces a flush. So even with a 1% hard cap I should see a couple of GB transfer at full 10GbE speed, and I'm not getting that at a 90% hard cap. 

Posted (edited)
54 minutes ago, Hikakiller said:

Sure, but from my understanding the background ratio is where it starts flushing the cache, and the dirty ratio is where it hard-stops and forces a flush. So even with a 1% hard cap I should see a couple of GB transfer at full 10GbE speed, and I'm not getting that at a 90% hard cap. 

The issue is that the physical media isn't fast enough, so once it starts flushing to media the transfer will slow down. That's why the background ratio also needs to be increased; writeback_centisecs may need increasing too.

 

But you need to know that these only control the maximum written to cache before flushing to media; they won't increase overall performance. They're only useful in some situations.

 

For example, sometimes I transfer a lot of data to the array: I move about 30GB at a time into the memory cache, hash the files (I hash every file) while they slowly flush to the array in the background, and once that completes I clear the cache and do the next round of transfers.
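
Roughly like this; a sketch with hypothetical paths:

# Copy one ~30GB batch; it lands in the RAM write cache first
cp -r /mnt/remote/batch01 /mnt/user/media/

# Hash the files while they flush to the array in the background
md5sum /mnt/user/media/batch01/* > /boot/hashes/batch01.md5

# Wait for all dirty pages to reach the array, then drop the clean page cache
sync
echo 3 > /proc/sys/vm/drop_caches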

Edited by Benson

20 hours ago, Benson said:

The issue is that the physical media isn't fast enough, so once it starts flushing to media the transfer will slow down. That's why the background ratio also needs to be increased; writeback_centisecs may need increasing too.

But you need to know that these only control the maximum written to cache before flushing to media; they won't increase overall performance. They're only useful in some situations.

For example, sometimes I transfer a lot of data to the array: I move about 30GB at a time into the memory cache, hash the files (I hash every file) while they slowly flush to the array in the background, and once that completes I clear the cache and do the next round of transfers.

Right, but I'm seeing the slowdown before even a 5% cap is reached. So even if it were forcing a hard stop, the slowdown isn't happening at that hard cache-flush point. 

