Cache Pool and btrfs RAID 1 Performance



I am developing and testing multiple VMs on my array, which is all mechanical drives with no cache drive assigned so far. I really need to add some SSDs to my system because the current performance is abysmal.

 

My upgrade path seems to be heading toward adding SSDs in a cache pool, but I still have some doubts about going down that path. The official statement from Limetech about cache pools says this about SSDs:

Quote

Use SSDs in a cache pool for the ultimate combination of functionality, performance, and protection.

 

That sounds fabulous, but after doing a lot of research I found that most people are concerned with the protection or usable storage size of the cache pool. I cannot find any information about the actual performance to expect from btrfs RAID 1. I have read somewhere that btrfs RAID 1 does not behave like conventional RAID, so I do not know what to expect in terms of performance. Would running a single VM benefit from using multiple SSDs in a cache pool? I really wish btrfs RAID 1 read speed would scale with the number of devices, so that two SSDs with a read speed of 500 MB/s each would consistently deliver 1000 MB/s out of the mirror. Has anyone tested btrfs RAID 1 performance and has some benchmarks to share? I am ready to give btrfs RAID 1 a shot with two SATA SSDs, but I really wonder if there are better options out there.
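If nobody has benchmarks to share, I suppose I could measure it myself once the pool is built with a quick fio run along these lines (the mount point /mnt/cache and the test file are just placeholders I picked, nothing official):

# one sequential reader with O_DIRECT, roughly what a single VM streaming from the mirror would see
fio --name=single-read --filename=/mnt/cache/fio-test.bin --rw=read --bs=1M --size=8G --direct=1 --ioengine=libaio --numjobs=1

# clean up the test file afterwards
rm /mnt/cache/fio-test.bin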

 

My other option would be to store my VM domains on SSDs that are part of my data array, but the official recommendation says:

Quote

SSD support in the array is experimental

I do not understand what could possibly go wrong, but I guess I had better not go against this advice. This is too bad, because a single high-end M.2 PCIe drive such as the Samsung 960 EVO would probably deliver the kind of performance I am looking for, at least for read requests. Let's face it, SATA SSDs will eventually become history because the SATA interface caps transfer speeds far below what the fastest NVMe SSDs can deliver. I feel that a Samsung 960 EVO is a more future-proof investment than pooling a bunch of SATA SSDs.

 

This is just for development and testing, so I am ready to trade some level of protection for better performance. I am backing up my VMs to an unassigned external hard drive in case the whole array goes south.

 

I might also consider passing SSDs through to a VM and running FreeNAS/Openfiler as a guest OS, using my SSDs in a proper RAID configuration. Running FreeNAS/Openfiler on anything other than a bare-metal machine is not officially approved for production environments, so I am not sure this is a great idea. I have been running FreeNAS on my Unraid server for a while without a glitch, so I am still considering this option for my home lab.

 

Has anyone been facing the same situation as me?

Link to comment

Thanks trurl for the reply! :D 

 

I do get the concept that unRAID is not RAID and that nothing is striped; in fact, that is what convinced me to go for unRAID, because I have a lot of archive data that just needs to sleep on non-spinning disks! I have solved the problem for my huge media archives, so that is not my concern anymore.

 

Now I am dealing with the problem of running VMs with decent performance. My 5 VMs are stored on a single WD Red and they are limited to very slow read speeds. When I do a lot of parallel read operations from all my VMs, like starting all of them at once, it is not smooth at all.

 

I am still not sure about my upgrade path. A single SATA SSD in the array should in theory deliver read speeds of around 500 MB/s, and an NVMe PCIe SSD may deliver more than 2000 MB/s from that device alone. I am aware of the lag caused by parity needing to be calculated and then written to the parity drive, but that should not be a read bottleneck for an SSD in the array; I have not tested that, but that is my understanding. Like I said, I would be pleased to have an SSD in the data array if I would at least gain some read performance for my VMs, knowing that data on those SSDs will not be striped like in normal RAID. I could do some manual file placement across several SSDs, and I assume that would give me some real read speed benefits. As for improving write speed, a cache pool is what I am looking for. My main concern about assigning SSDs to my array is that SSDs are not officially supported in the data array, so I may have to forget about that option.

 

Talking about the cache pool, knowing about the other RAID configurations supported by btrfs is interesting, but like I said, all the conversations focus on levels of protection or the size of the pool, with no details about expected performance. I would like some numbers on how btrfs RAID 1 or RAID 0 performs. What is troubling me is that I have seen in a post that btrfs RAID 1 does not systematically read from the two mirrored devices at the same time; instead, it takes multiple concurrent OS processes requesting reads to push the RAID 1 pool to read from multiple devices simultaneously. I am not sure how btrfs behaves. A perfect RAID 1 implementation with, say, two devices would deliver the read speed of the two devices combined while leaving write speed unimproved, but my question remains:

Can we really achieve that kind of RAID 1 read performance with btrfs?

 

Link to comment

As you have found, running VMs off array disks will never get good performance because in practice they all do a significant amount of writes, and writes are a lot slower to array drives. Using a cache pool of SSDs is frequently done to get better performance than array drives, but the cache does not get speed improvements from adding additional drives to the cache pool - they are used to provide protection and/or to get additional space in the cache.

 

You have omitted one scenario that may well be best for you - using SSDs for your VMs that are not part of the cache or the array (typically mounted using the Unassigned Devices plugin). The downside of this approach is that you become responsible for handling the backup of the VMs yourself (probably to an array disk).

Link to comment

Thanks itimpi for your feedback!

 

So you are confirming my fear that adding more SSDs to the cache pool will not provide any performance benefit. That is a major disappointment, but I can live with it. I am recapping my options for storing my VMs here:

 

1)    Have unRAID create a cache pool with two SSDs: Adding additional SSDs to the pool won't improve performance. Compared to using a hard drive from my array I would still have a performance gain, but I won't be able to scale up the performance if ever needed; I will be limited to the IO of a single SSD. This is still worth a try, because a Samsung 960 EVO used as cache would at least provide several times the performance of a single hard drive on my array.

      

2)   Unassigned SSDs: By manually placing my VM files on different unassigned SSDs, I would get the full read/write IO performance inherent to each of those SSDs. This would come at the expense of not having my VMs protected, so I would need to set up a manual backup solution. This really sounds like a simple way to scale up and maximize IO with parallel processing, so I will go for this option (a rough sketch of what I have in mind follows this list).

 

3)   Create a hardware RAID 1 pool of SSDs and have unRAID use that pool as cache. I am not even sure this is recommended or possible. My feeling is that unRAID is not really friendly to hardware RAID.

 

4)   Run FreeNAS/Openfiler as a guest on my unRAID host and pass the SSDs through. With ZFS I would in theory have the freedom to explore all possible RAID configurations and run some benchmarks. This is not a recommended setup for production, but for my home lab it might be worth at least running some tests.
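Here is the rough sketch of option 2 I mentioned above (the mount points and share names are just examples from my own setup, assuming the Unassigned Devices plugin mounts the SSD under /mnt/disks):

# copy a vdisk from the domains share on the array onto an unassigned SSD
mkdir -p /mnt/disks/ssd_vm1/domains/vm1
cp /mnt/user/domains/vm1/vdisk1.img /mnt/disks/ssd_vm1/domains/vm1/
# then point the VM template at the new vdisk location before starting it

# nightly backup of the unprotected vdisk back to the array
rsync -a /mnt/disks/ssd_vm1/domains/vm1/ /mnt/user/backups/vm1/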

 

I still need to factor cost into my decision. If there is interest, I will publish my test results once I upgrade my server with SSDs.

Link to comment

The BTRFS cache pool isn't limited to using RAID level 1. RAID 0 and RAID 10 are options too. You might want to experiment with them to see how throughput is affected.
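If you do experiment, the profile of a mounted pool can be changed with a balance, something along these lines (I am assuming the pool is mounted at /mnt/cache; treat this as a sketch and check the FAQ before running it):

# e.g. convert the pool to RAID 10 (needs at least four devices); raid0, single, etc. use the same switches
btrfs balance start -dconvert=raid10 -mconvert=raid10 /mnt/cache

# confirm which profiles the pool is using afterwards
btrfs filesystem df /mnt/cache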

 

I understand the reason SSDs are not supported in the main array is that the TRIM operation would corrupt parity.

Link to comment

@johnnie.black Your experience seems to conflict with other tips I have seen so far. @itimpi said that adding more devices to the cache pool would not change anything performance-wise. My initial feeling was that if I assign more than one device to the cache pool, then I should expect btrfs RAID 1 to deliver better read speed without any improvement in write speed. I am still confused about how running multiple processes can translate into reading operations from multiple devices at the same time. Does running a single VM involve one process or multiple processes? I guess I need to do more tests to find out. The last thing I want is to go back to FreeNAS for storing my VMs. My VM lab is a bunch of databases for BI projects; some do heavy reads, others do more writes, so spreading my IO efficiently is important to me.

 

@John_M If TRIM can be disabled, would this make it safe to use SSDs in the data array?

 

@tjb_altf4 Your setup with multiple 960s looks so cool, that is what I had in mind too. Too bad you have not run throughput benchmarks yet; if speed improvements for btrfs raid 1 and 10 are indeed in the pipeline, then it would be great to have some kind of baseline. In theory your RAID 0 with two 960s should deliver twice the speed for both read and write operations, so I am just curious to know if that is what you are getting now. Let me know if you take the plunge on a RAID 10 of Samsung 960s; I want to know if it is all worth the money.
 

Link to comment
4 minutes ago, ReneP said:

If TRIM can be disabled, would this make it safe to use SSDs in the data array?

TRIM is already disabled in the array. I think the issue is more about any particular SSD's implementation of garbage collection and whether it might invalidate parity. Not sure how valid this concern is.

Link to comment

Well, if reading from an SSD starts returning blocks of zeros where it previously returned blocks of data, due to its internal garbage collection, there is no way to keep parity in sync. And if it is prevented from performing garbage collection, its write performance will drop over time. Admittedly, this is an oversimplification, but I think the principles are valid.

Link to comment
1 hour ago, ReneP said:

 I am still confused about how running multiple processes can translate into reading operations from multiple devices at the same time.

That's the way btrfs currently reads from multiple devices: it's based on the process PID - the first process reads from one device, the next process from the next, and so on.

Link to comment
1 minute ago, johnnie.black said:

That's the way btrfs currently reads from multiple devices: it's based on the process PID - the first process reads from one device, the next process from the next, and so on.

Just to add, this is valid for mirrors only, e.g. raid1. For raid0/10, data is striped across multiple disks, so a single process will read from multiple devices.
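An easy way to see the difference in practice is to compare one reader against several readers on the same pool, e.g. with fio (the mount point /mnt/cache and sizes below are just examples):

# a single process: on a raid1 pool this stays on one mirror
fio --name=one-reader --directory=/mnt/cache --rw=read --bs=1M --size=4G --direct=1 --ioengine=libaio --numjobs=1

# four processes (four PIDs): reads get spread across both mirrors
fio --name=four-readers --directory=/mnt/cache --rw=read --bs=1M --size=4G --direct=1 --ioengine=libaio --numjobs=4 --group_reporting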

Link to comment
  • 3 weeks later...
On 3/7/2018 at 2:08 PM, ReneP said:

@johnnie.black Your experience seems to conflict with other tips I have seen so far. @itimpi said that adding more devices to the cache pool would not change anything performance-wise. My initial feeling was that if I assign more than one device to the cache pool, then I should expect btrfs RAID 1 to deliver better read speed without any improvement in write speed.
 

 

By default the cache pool runs raid 1, I believe. Someone please correct me if this is wrong. But you can run other raid levels on the cache pool if desired; it just requires using the command line a bit.

Link to comment
8 minutes ago, MadMage999 said:

By default the cache pool runs raid 1, I believe. Someone please correct me if this is wrong. But you can run other raid levels on the cache pool if desired; it just requires using the command line a bit.

 

The necessary switches can be entered in the webUI. See johnnie.black's FAQ entry here:

 

https://lime-technology.com/forums/topic/46802-faq-for-unraid-v6/#comment-480421

 

Link to comment
On 3/7/2018 at 5:11 AM, John_M said:

The BTRFS cache pool isn't limited to using RAID level 1. RAID 0 and RAID 10 are options too. You might want to experiment with them to see how throughput is affected.

JBOD can also be added to this list - just FYI for anyone who cares.

Link to comment
  • 1 month later...

I am also interested in the performance of different BTRFS RAID levels. I am about to upgrade my unRAID 6.4.1 server's motherboard to a Gigabyte Designare EX, which has three onboard NVMe slots. I have three Samsung 960 drives I would like to use and am not sure of the best way to configure them. I was thinking about RAID 5, but I still see the disclaimer:

"RAID5/6 are still considered experimental and not ready for production servers, though most serious issues have been fixed on current kernel at this of this edit 4.14.x"

 

This is a production server, so I hesitate to use RAID5 unless this is old information.

 

The motherboard has onboard RAID for the NVMe slots, but I'm not sure about the reliability of unRAID using a single cache device with the motherboard providing the underlying RAID for the three NVMe drives.

 

What do you guys recommend for best performance and space using 3 NVMe drives? (I am backing up cache data regularly to the array.)

Are there still issues with RAID5 for cache on 6.4.1?

Link to comment
4 minutes ago, guru69 said:

This is a production server, so I hesitate to use RAID5 unless this is old information.

Most raid5/6 issues are resolved now, but it suffers from the write hole problem that affects almost all raid5/6 setups.

 

5 minutes ago, guru69 said:

What do you guys recommend for best performance and space using 3 NVMe drives?

Three is not a good number; you'd need four for raid10. So, assuming regular backups, I would probably use raid0 (with raid1 for metadata).
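If you go that route, the conversion would look something like this on the mounted pool (the path is just an example):

# striped data across the three drives, metadata kept mirrored
btrfs balance start -dconvert=raid0 -mconvert=raid1 /mnt/cache

# check how data and metadata ended up distributed and the usable space
btrfs filesystem usage /mnt/cache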

Link to comment

Ah yes, the write hole problem. I will forget RAID5 then. You're right, three is not a good number. Would it make more sense to pick up one more SSD and use RAID10? Am I right in assuming RAID10 will provide the best write speed and reliability for the cache pool?

Link to comment
