Turn entire array into BTRFS cache pool?

July 13, 2015

I'm not sure where to post this question and I more than likely don't fully understand the technical merits or issues behind this question... but I was wondering... since we can now have a cache pool of drives running BTRFS (which I have read is a great file system for this purpose), having a write speed advantage over the parity disk approach used in the current unRAID array... why not just make the entire array run this way? That way there would:

1) be no need for a cache drive or cache pool to get full write speed all the time, without the extra pool of disks adding expense, heat and power consumption to the server

2) still be the ability to have different sized disks in the array

3) still be the ability to add or remove disks from the array as required and grow the array to a large number of disks

Maybe I'm missing something as I stated earlier...? I'm trying to understand what the drawbacks would be?

July 13, 2015

Because too many issues exist with BTRFS today, including corruption that cannot be fixed. I don't trust it to even store my temp files.

Plus you lose the advantage of having unused drives spin down.

July 13, 2015

Author

Because too many issues exist with BTRFS today, including corruption that cannot be fixed. I don't trust it to even store my temp files.

Plus you lose the advantage of having unused drives spin down.

If data corruption is such an issue, why are we using it for the cache pool then??

Also for me anyway, I never spin down the drives for several reasons... Main one is so that there is no annoying lag time waiting for a disk to spin up when accessing a file.

There must be some other reasons...

July 13, 2015

I'm not sure where to post this question and I more than likely don't fully understand the technical merits or issues behind this question... but I was wondering... since we can now have a cache pool of drives running BTRFS (which I have read is a great file system for this purpose), having a write speed advantage over the parity disk approach used in the current unRAID array... why not just make the entire array run this way? That way there would:

1) be no need for a cache drive or cache pool to get full write speed all the time, without the extra pool of disks adding expense, heat and power consumption to the server

2) still be the ability to have different sized disks in the array

3) still be the ability to add or remove disks from the array as required and grow the array to a large number of disks

Maybe I'm missing something as I stated earlier...? I'm trying to understand what the drawbacks would be?

If this is what you are thinking why not just run a RAID-5 or RAID-6 Array to start with?

July 13, 2015

Because too many issues exist with BTRFS today, including corruption that cannot be fixed. I don't trust it to even store my temp files.

Plus you lose the advantage of having unused drives spin down.

If data corruption is such an issue, why are we using it for the cache pool then??

Also for me anyway, I never spin down the drives for several reasons... Main one is so that there is no annoying lag time waiting for a disk to spin up when accessing a file.

There must be some other reasons...

You might be using it, but I and others are avoiding it like the plague.

If you don't want the typical unRAID drive benefits, just run RAID 6 mdadm arrays on Ubuntu/Debian, or ZFS on some other Linux flavor.

July 13, 2015

Author

Easy guys, I already prefaced my comments and questions with the fact that I don't have the technical depth that you have which is why I was asking the questions in the first place. Part of my question was asking what those technical benefits of the current unRAID array are vs the cache pool approach using the BTRFS file system on the disks. Based on the way the cache pool operation is described, it seemed as though it was also easily expandable with different sized disks, offered 1-disk failure level of redundancy, and it had the added benefit of faster write speed to the array... so what are the drawbacks? And this file corruption issue... if it's an issue, why is unRAID using it for the cache pool then?

Just trying to better inform myself, that's all.

July 13, 2015

BTRFS is very new here, and we are just now gaining user experience with it. It has had so much promise (and hype), and seems like the next great thing for a future file system for unRAID. It did seem to work for most at first, with only a few reports of issues, but even with the fixes/patches LimeTech keeps adding, the stability is just not there yet, so neither is our confidence in it. I think initially that many of us were planning to move to it, but the negative reports just keep coming. Right now, I'd feel safer with a single Cache drive with XFS than a Cache Pool with BTRFS. And I'm sure that's very disappointing to Tom and Jon, and to all of us, that BTRFS isn't quite as ready as it had seemed previously.

Tom and Jon haven't made any statements about it yet, probably because they aren't ready to give up on it so soon. And to be fair, we really don't know how widespread the issues are. You only hear from the failures.

July 13, 2015

And to be fair, we really don't know how widespread the issues are. You only hear from the failures.

I think perhaps the issue is in the recovery tools and techniques not being mature, and btrfs seems to be more fragile than reiserfs. I had an unmountable btrfs cache drive that was caused by a poorly timed lockup and subsequent hard reset. Instead of replaying the transaction log, it just crashed the btrfs module when it tried to mount. I was able to rescue it, but I spent many hours trying different options and techniques before I found the solution.

I think as btrfs ages and we get a better feel for the care and feeding it will become a non-issue. Best practices come about as a result of finding how things don't work.

July 14, 2015

Author

BTRFS is very new here, and we are just now gaining user experience with it. It has had so much promise (and hype), and seems like the next great thing for a future file system for unRAID. It did seem to work for most at first, with only a few reports of issues, but even with the fixes/patches LimeTech keeps adding, the stability is just not there yet, so neither is our confidence in it. I think initially that many of us were planning to move to it, but the negative reports just keep coming. Right now, I'd feel safer with a single Cache drive with XFS than a Cache Pool with BTRFS. And I'm sure that's very disappointing to Tom and Jon, and to all of us, that BTRFS isn't quite as ready as it had seemed previously.

Tom and Jon haven't made any statements about it yet, probably because they aren't ready to give up on it so soon. And to be fair, we really don't know how widespread the issues are. You only hear from the failures.

Well, now I'm glad I asked the question!!! Reading through the v6 release notes and installation guide and the website the new cache pool sounds so good... but what you guys are telling me is that it's not yet ready for prime time and I should avoid it? It would be nice if LimeTech would make a statement one way or another on this!

Now I do not know what to do. I don't have any cache drives at all yet and was about to put in a couple of new disks to make a cache pool..

Let me ask you this - if one of the BTRFS disks in the pool fails, I'm assuming since there is redundancy between the two disks that the failed disk can easily be replaced and rebuilt just like a disk in the array right? Please correct me if I'm wrong... there is no documentation on that process. How is this situation handled and are there any other caveats to be aware of?

I would really like a good recommendation on what I should do for cache given that I'm starting from scratch.

Thanks!

July 14, 2015

I just created a new 6.0 unRaid Pro server and was surprised that the default cache drive format was btrfs. During the late 6 beta series it was still xfs. Tom must have some confidence in it to do that.

July 14, 2015

I just created a new 6.0 unRaid Pro server and was surprised that the default cache drive format was btrfs. During the late 6 beta series it was still xfs. Tom must have some confidence in it to do that.

I updated to the v6 series at beta14b (which I would consider pretty late in the beta series), and the default cache drive format was btrfs. I hadn't even checked the setting and was surprised after I formatted a new cache drive that it was btrfs. I promptly changed it to xfs and reformatted.

July 14, 2015

The BTRFS cache pool is very similar to RAID1. Redundancy requires 2x the storage. So if you have 20 TB of data (five 4 TB drives), you would need another 20 TB (five more 4 TB drives) to protect it. Adding another disk of data would also require adding another disk of redundancy.

UnRaid's parity protection requires one drive equal in size to the largest disk in the array for redundancy (parity). So in the example above, unRaid would require one 4 TB drive for redundancy rather than five. And the array could continue to grow with no further redundancy.

Even if BTRFS were bug free and working perfectly, it is not a reason to move an unRaid array to a BTRFS structure.
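To make the arithmetic above concrete, here is a minimal sketch comparing the two schemes. It assumes equal-sized drives, a two-copy mirror, and a single parity drive; the function names are made up for illustration and are not part of unRaid or btrfs.

```python
def mirror_redundancy_drives(data_drives: int) -> int:
    """Two-copy mirror (RAID1-style): one extra drive per data drive.

    Assumption: all drives are the same size.
    """
    return data_drives


def parity_redundancy_drives(data_drives: int) -> int:
    """Single-parity array: one parity drive no matter how many data drives."""
    return 1 if data_drives > 0 else 0


DRIVE_TB = 4  # drive size used in the example above

# Compare the redundancy cost for the 5-drive example and after adding a 6th drive.
for data_drives in (5, 6):
    mirror_tb = mirror_redundancy_drives(data_drives) * DRIVE_TB
    parity_tb = parity_redundancy_drives(data_drives) * DRIVE_TB
    print(
        f"{data_drives} data drives ({data_drives * DRIVE_TB} TB): "
        f"mirror needs {mirror_tb} TB extra, single parity needs {parity_tb} TB extra"
    )
```

The mirror cost grows with every data drive added, while the parity cost stays at one drive as long as the parity drive is at least as large as the largest data drive.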

July 14, 2015

Author

The BTRFS cache pool is very similar to RAID1. Redundancy requires 2x the storage. So if you have 20 TB of data (five 4 TB drives), you would need another 20 TB (five more 4 TB drives) to protect it. Adding another disk of data would also require adding another disk of redundancy.

UnRaid's parity protection requires one drive equal in size to the largest disk in the array for redundancy (parity). So in the example above, unRaid would require one 4 TB drive for redundancy rather than five. And the array could continue to grow with no further redundancy.

Even if BTRFS were bug free and working perfectly, it is not a reason to move an unRaid array to a BTRFS structure.

That makes sense and agreed! Thanks for the clarification!

So my last question (and hoping to put this to rest), is what is the recommendation for cache? Limetech says use a cache pool but all of you guys say just use a single cache drive running XFS since BTRFS is full of bugs and unreliable. I need to get my cache up and running so I can get my dockers and VMs going and want to know how to move forward?

July 14, 2015

Just thought I would chime in here with my experiences.

I have been using 2x120GB btrfs raid1 cache pool since v6b6 without issue. I have a 20GB docker.img on it which I have never come close to filling, I do not cache user share writes, I have about a dozen dockers using it for appdata, and I typically only have about 40% used space of the 120GB.

I have had no trouble with this configuration. That said, I also have a weekly backup of the cache to the array, and it's never been clear exactly how to fix any problems if they did occur, or indeed how to change this configuration if I wanted to.

July 14, 2015

Community Expert

So my last question (and hoping to put this to rest), is what is the recommendation for cache? Limetech says use a cache pool but all of you guys say just use a single cache drive running XFS since BTRFS is full of bugs and unreliable.

Actually Limetech simply says that a cache pool is now an option if you want your files to be protected even when on the cache. It does not mandate a cache pool, but if you want a cache pool then you HAVE to use BTRFS format.

Saying that BTRFS is 'full of bugs' is a rather strong statement. What is being said is that BTRFS is new, and the associated utilities are relatively immature. The vast majority of the time BTRFS should be fine, and one assumes that any important files on the cache drive are being backed up elsewhere so if anything does go wrong one can wipe the cache and start again with a clean setup.

I need to get my cache up and running so I can get my dockers and VMs going and want to know how to move forward?

There is no hard-and-fast answer here. If you want data on the cache to be protected (i.e. use a cache pool) then you have to use BTRFS. If this is not a requirement then XFS is a tried and trusted solution. Only you know how important it is to you that files on the cache are protected.

July 14, 2015

I think that once again the concept of RAID redundancy is being confused with the need for backups.

Just because the cache pool is protected against a single drive failure if you use a btrfs RAID1 doesn't mean that you are protected from data loss.

I'm still using btrfs even after my incident with it; I am just more vigilant about keeping backups current.

Saying that BTRFS is 'full of bugs' is a rather strong statement. What is being said is that BTRFS is new, and the associated utilities are relatively immature. The vast majority of the time BTRFS should be fine, and one assumes that any important files on the cache drive are being backed up elsewhere so if anything does go wrong one can wipe the cache and start again with a clean setup.

This, exactly.

I'd bet the vast majority of data loss is caused by users, not file systems or hardware.

July 14, 2015

Author

Just thought I would chime in here with my experiences.

I have been using 2x120GB btrfs raid1 cache pool since v6b6 without issue. I have a 20GB docker.img on it which I have never come close to filling, I do not cache user share writes, I have about a dozen dockers using it for appdata, and I typically only have about 40% used space of the 120GB.

I have had no trouble with this configuration. That said, I also have a weekly backup of the cache to the array, and it's never been clear exactly how to fix any problems if they did occur, or indeed how to change this configuration if I wanted to.

The point of the cache pool is to add redundancy. Agreed, that is not a replacement for a good backup strategy, but just like for the array itself, extra redundancy in the cache makes logical sense to me, especially if running critical apps or VMs where data can potentially change very frequently in between backups.

To your point above, I agree that the documentation on how to fix problems in the cache pool if they occur is weak or missing. That is another reason I'm unsure about whether or not to go down this route. Does anyone know what happens if a drive in the cache pool fails? How is it replaced?

July 14, 2015

I believe if a drive were to fail the volume would continue to operate in degraded mode. Not sure how or if unRaid would let you know. There has to be a way to rebuild the failed disk, but have never seen it documented. I guess the first person that needs to do it will need support from LT.

July 14, 2015

Author

I believe if a drive were to fail the volume would continue to operate in degraded mode. Not sure how or if unRaid would let you know. There has to be a way to rebuild the failed disk, but have never seen it documented. I guess the first person that needs to do it will need support from LT.

Wow, I would hate to be the first :-)

I'm kind of getting scared away from trying out the cache pool... potential "challenges" with the BTRFS file system coupled with somewhat unclear/untested methods for knowing if a drive fails and then how to even rebuild or replace it...

Can LimeTech chime in here and provide some more detail on how the cache pool works in these situations?

1) how to know if a cache pool disk has failed? What happens when a disk fails? Cache pool still operational? Degraded performance?

2) how to replace/rebuild a failed disk?

3) how to replace a working disk with a bigger disk?

4) how to add more disks to the pool?

5) how to stop using a pool and revert back to a single cache disk?

July 14, 2015

5) how to stop using a pool and revert back to a single cache disk?

I can answer that one, as I just went through trying to help someone recover from this situation.

You can't directly convert a degraded RAID volume back into a single disk. Once a raid volume, always a raid volume. You can mount the single remaining disk in a "rescue" mode, and copy the content elsewhere, then reformat as a single volume and copy the data back.

You can theoretically add more volumes to the RAID1 set, and rebalance the data across all the volumes, and even remove volumes if you have enough space on the remaining drives to keep RAID1 redundancy. You just can't go below 2 disks for a RAID1 set.

All this works great in theory, but nobody is piping up with actual user experiences yet, so nothing to point to for actual examples.

July 14, 2015

Author

5) how to stop using a pool and revert back to a single cache disk?

All this works great in theory, but nobody is piping up with actual user experiences yet, so nothing to point to for actual examples.

Yeah, that's the part that scares me!! :-)

July 14, 2015

I believe if a drive were to fail the volume would continue to operate in degraded mode. Not sure how or if unRaid would let you know. There has to be a way to rebuild the failed disk, but have never seen it documented. I guess the first person that needs to do it will need support from LT.

Wow, I would hate to be the first :-)

I'm kind of getting scared away from trying out the cache pool... potential "challenges" with the BTRFS file system coupled with somewhat unclear/untested methods for knowing if a drive fails and then how to even rebuild or replace it...

Can LimeTech chime in here and provide some more detail on how the cache pool works in these situations?

I've been studying the btrfs code to see how it behaves in certain failure situations. I'm not ready to publish a report on this yet but I can give some answers to your questions.

1) how to know if a cache pool disk has failed? What happens when a disk fails? Cache pool still operational? Degraded performance?

This is the primary area of study. Short answer: yes it should still be operational and yes it will have degraded performance. Usually what admins do in cases of dying devices is to recognize this via syslog and yank out the device which is failing.

Clearly this is not necessarily the best strategy for unRaid. Here's how the parity-protected array works. First, the only time a device is "disabled" is as a result of a non-recoverable write error. Once a device is disabled it is not accessed in any way while the array is Started. When the array is Stopped, it will be accessed to try to determine its identity (model/serial) - this is for the purpose of determining whether you replaced the device. Depending on how the device fails, it's conceivable that simply trying to read the identity could cause "long" periods where it appears the webGui is 'hung' - in reality it's not hung, it's simply waiting.

In general, Linux storage drivers try to be extremely heroic in attempting to recover data. That means they do bus resets, retries, etc., in an effort to read back data. (You can understand why they do this: it's better to return data after a long delay vs. never returning data at all.) This applies to reads as well: depending on the nature of the defect, it could take a "long" time to complete a read. At least in the unRaid parity-protected array case, if a read should fail with an unrecoverable error, we will 'rebuild' the data by reading all the other devices and using parity reconstruction. In this case, we then try to write the rebuilt data back to the device - if the device is seriously defective this write will fail, resulting in disabling it.
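As a rough illustration of the "parity reconstruction" step described above, here is a toy sketch. It assumes single parity computed as a byte-wise XOR across all data devices and works on small in-memory blocks; the helper name and sample data are made up, and this is not the actual unRaid driver code.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (assumed single-parity scheme)."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) walks the blocks column by column, one byte from each device.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data devices holding one block each, plus their parity.
data = [b"disk1 data", b"disk2 data", b"disk3 data"]
parity = xor_blocks(data)

# Suppose a read from the second device fails: rebuild its block by
# XOR-ing the blocks from every other device together with parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The rebuild has to read every other device, which is why a single failed read can take a while, and why writing the rebuilt block back is the point at which a seriously defective device would be disabled.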

Back to btrfs: I don't believe there is similar logic in btrfs to "disable" a device. It may turn out that we create a thin md/unraid type layer between btrfs and the device driver to do this.

2) how to replace/rebuild a failed disk?

Yank the bad device, install a new one and execute the 'balance' utility (via the 'cache' page link).

3) how to replace a working disk with a bigger disk?

Yank the small device, install a new one and execute the 'balance' utility (via the 'cache' page link).

4) how to add more disks to the pool?

Install a new one and execute the 'balance' utility (via the 'cache' page link).

5) how to stop using a pool and revert back to a single cache disk?

One way: copy all the data off, disassemble pool, create cache disk, copy data back. Another way would be to remove device, balance, remove next device, balance, etc., until down to one device. Note that this method will not work if there is insufficient space on the last device.

In general, IMHO, btrfs is "mature enough", especially for single devices such as single-device cache or an array device. Even for multi-device Raid-1 it's "pretty solid". Why am I being wishy-washy? Because, for me, btrfs has been just fine, but I recognize there are reports out there of issues. btrfs has been around quite a while now, and there are some very big companies putting lots of resources into making it better and better. Clearly it is/will be the linux file system of choice. Is it now? We would not have given you the option to use it in unRaid if we thought it was just plain terrible. Also, the major reason we diligently keep up with linux kernel releases, btw, is to keep up to date with latest btrfs development. btrfs offers some really great features: COW/data checksum and subvolume snapshots. The latter will enable some really cool features we have planned.

I'll conclude with this (take it or leave it): IT guys tend to be the most conservative lot on the planet. It doesn't take much to get those guys nervous, whether the evidence is anecdotal or not. The bottom line is that everyone has to make their own informed decision about how to store their data.

July 14, 2015

Another way would be to remove device, balance, remove next device, balance, etc., until down to one device.

Tried that, didn't work. http://lime-technology.com/forum/index.php?topic=41326.msg392229#msg392229

July 14, 2015

Another way would be to remove device, balance, remove next device, balance, etc., until down to one device.

Tried that, didn't work. http://lime-technology.com/forum/index.php?topic=41326.msg392229#msg392229

The code I'm looking at, 6.0.1, in the case where only one device is left in the pool, executes

/sbin/btrfs balance start -f -dconvert=single -mconvert=single /mnt/cache

and then

/sbin/btrfs device delete missing /mnt/cache

The fact the post you cite does not have those lines means either that person was using an older version, or there's a bug.
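For anyone who wants to see that sequence spelled out, here is a small sketch that builds the two commands quoted above and, by default, only prints them. The function names and the dry-run switch are made up for illustration; this is not how unRaid itself invokes them, and running it for real against a pool is at your own risk.

```python
import subprocess

BTRFS = "/sbin/btrfs"


def single_device_commands(mount_point="/mnt/cache"):
    """Return the two commands quoted above, in order, as argument lists."""
    return [
        # Convert data and metadata profiles to 'single' so one device suffices.
        [BTRFS, "balance", "start", "-f", "-dconvert=single", "-mconvert=single", mount_point],
        # Drop the record of the device that is no longer present.
        [BTRFS, "device", "delete", "missing", mount_point],
    ]


def convert_to_single(mount_point="/mnt/cache", dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    commands = single_device_commands(mount_point)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
    return commands


convert_to_single()  # dry run: prints the two commands without touching the pool
```

Keeping the dry run as the default means nothing is changed unless `dry_run=False` is passed explicitly.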

July 15, 2015

Author

LimeTech,

Thanks for providing the detailed responses to my questions, it is much appreciated. I feel a lot more comfortable going forward with a cache pool now. Redundancy in the cache was the main reason why I decided not to use a cache disk in v5... I can live with the slower write speeds. But now that dockers and VMs are in play, it sounds to me as though the benefits of using a cache pool are even more important.

I'm hoping that in upcoming releases you can provide some built-in tools to proactively monitor and alert on drives in the cache pool that are failing and also to make it as simple as possible to administer and make changes to the pool via the web GUI.

Thank you.
