Multiple File System Support: Feature Highlights


jonp


There are two key subjects I want to address with respect to Multiple File System Support in Beta 7.

 

Benefits of BTRFS Support for both Cache Pool and Array Devices

Some may be wondering what the benefits of BTRFS are in beta 7 compared to beta 6.  Here are just a few highlights:

 

1)  Intelligent Use of SSDs for read operations in a RAID1 pool

Scenario:  You have two drives in a RAID1 pool, one HDD and one SSD.  When write operations occur, the bottleneck for performance will be the HDD.  However, when read operations occur, btrfs will prefer the use of the SSD every time.  This has its benefits.
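For the curious, a mixed HDD/SSD RAID1 pool like this can be sketched with the stock btrfs tools (device names /dev/sdX and /dev/sdY are placeholders; in unRAID the GUI handles this for you):

```shell
# Hypothetical devices: /dev/sdX is the HDD, /dev/sdY is the SSD.
# -d raid1 mirrors data chunks, -m raid1 mirrors metadata chunks.
mkfs.btrfs -d raid1 -m raid1 /dev/sdX /dev/sdY

# Mount either member device; btrfs assembles the whole pool.
mount /dev/sdY /mnt/cache
```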

 

2)  Allowing for mixing / matching different drives

This breaks the two fundamental rules of a traditional RAID 1 array, but thanks to how BTRFS spreads data out in chunks across multiple drives, it is really both a benefit and a curse in one.  That is why proper balancing is important, along with a good understanding of your fault tolerance.  E.g. a three-drive pool cannot suffer more than a single drive loss before losing data.
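To see how chunks are spread across a mixed pool and to re-balance after adding or swapping drives, the stock btrfs commands look roughly like this (mount point /mnt/cache assumed):

```shell
# Show how data/metadata chunks are allocated across the pool.
btrfs filesystem df /mnt/cache

# List member devices and per-device usage.
btrfs filesystem show /mnt/cache

# Rewrite existing chunks so they follow the RAID1 profile
# across all current members (can take a while on large pools).
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache
```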

 

3)  BTRFS is the ONLY way right now to achieve data protection for cache-data and allow multiple disks to operate in a pool

All that mixing/matching is worthless if you couldn't have multiple drives act as "one" logical mount point (/mnt/cache) to begin with.  BTRFS RAID1 pools aren't perfect in unRAID yet, but they are functional and significantly improve the overall reliability of your system, which we feel is even more important when running more and more applications.

 

4)  Copy on Write and Snapshot Support

These two big features really speak to the sophistication of this file system.  Virtual Machines wishing to use the QCOW2 image format for snapshotting, encryption, and compression will need to have their storage be formatted with BTRFS.
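As a rough sketch of what copy-on-write and snapshots look like from the command line on a btrfs mount (the paths here are hypothetical examples, not what unRAID sets up for you):

```shell
# Create a subvolume to hold VM images, then snapshot it instantly.
btrfs subvolume create /mnt/cache/vms
btrfs subvolume snapshot /mnt/cache/vms /mnt/cache/vms-backup

# A CoW file copy: completes instantly and shares extents on disk
# with the original until either copy is modified.
cp --reflink=always /mnt/cache/vms/test.img /mnt/cache/test-copy.img
```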

 

5)  Beta 7 now allows users without cache devices to use Docker on a BTRFS-formatted array device

Until now, users have had to either use a non-array partition or reformat their cache drive to BTRFS in beta 6.  In beta 7, you can use BTRFS on an array device, even in the free version.  This opens Docker up to users who previously couldn't use it.

 

6)  BTRFS native Trim support

As mentioned before, BTRFS is SSD aware and as such, it also has native support for TRIM operations, which will substantially reduce write amplification and wear on your SSD devices in unRAID.  We do not currently recommend using an SSD on an array device in unRAID, but in a cache pool, it is quite advantageous.
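If you ever want to trigger a trim by hand on a mounted btrfs filesystem, the standard tool is fstrim (shown against the usual cache mount point as an example):

```shell
# Discard unused blocks on the cache filesystem and report how
# much was trimmed (-v for verbose output).
fstrim -v /mnt/cache
```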

 

In addition to the above, there is eventually potential for BTRFS to implement some really cool stuff for file-based RAID instead of the current method.  This would be like Tom's user-shares but built into BTRFS, operating at a mixture of file/block level for potentially great performance.  What would this do?  Well, in a pool, you could configure it to make sure that each FILE written to the pool is completely stored (in total) on however many devices you want.  So if for an individual file you wanted it to be triple redundant, you could do that.  If for another file you only wanted it to be able to suffer two disk failures, you could do that too.  It would be intelligent enough to manipulate the RAID method on a file-by-file basis and, again, potentially with great performance.

 

Notice the two bolded words above.  These are "futures" that are in discussion on the BTRFS project (you can research yourself to find the references).  They are not built in for unRAID to take advantage of right now.  But the potential value-prop, if/when it does materialize, would be fairly massive.

 

Just thought it'd be good to highlight some of this for those that don't understand.

 

reiserfs vs. xfs for array devices

I need to do more of this testing, but generally speaking, I think it's fairly clear that xfs vs. reiserfs is more about making a chess move now that will play out to our advantage in the longer term.  Quite simply: we believe that if we fast-forward the clock, sooner or later there will be a point in time where xfs is going to have advantages for users over reiserfs.  Call it a hunch, an educated guess, a prediction...whatever.  We just really think reiserfs is on its way out the door.  The bottom line is that for array devices, I would suggest migrating away from reiserfs as you find the opportunity to do so.  It's not a rush...  It's not going to break...  It's been very stable...  That said, think chess moves...  In addition, ALL cache pool devices should be BTRFS for now in my opinion, unless you're never planning to expand past a single unprotected cache disk.  If you don't have a secondary cache device yet, you can straggle along, but if you want to use a cache pool, you will need two btrfs disk devices assigned.

 

Whether you go XFS or BTRFS for your array devices then is a personal choice, but here's the difference in my opinion:

 

XFS = mature and suitable replacement to reiserfs with longer-term supportability

BTRFS = more sophisticated and intelligent storage file system, but still undergoing lots of development.

 

If BTRFS can fulfill its promises, it will be the most intelligent file system available.  Quite simply, it would do for filesystems what Docker is doing for applications.  It would universally optimize storage use, making it economical to grow capacity and performance, as well as support different levels of protection for individual files.  But this also sounds like the search for the holy grail...

 

My array at home has a mix of btrfs, xfs, and reiserfs array devices with media content spread across all of them in a user share.  They work in concert just fine.

 

As for XFS vs. btrfs for array devices, I personally haven't noticed a difference in my use of them there.  The biggest benefit of BTRFS comes when the application you run takes advantage of its capabilities as a filesystem.  Docker is the best example of that in how it uses snapshots and copy-on-write to make more efficient use of disk capacity and I/O.  VMs with QCOW2 image formats also take advantage of BTRFS capabilities.  That said, I don't find it useful to run VMs on the array on spinning disks.  Just a personal preference as someone who is constantly testing different VMs.  It's not that they don't work, it's just that performance will always be faster in a cache pool, especially with SSDs like I'm using for my cache devices.  Point is, even with just a pair of 256GB SSDs, I can fit quite a lot of applications for both Docker and VMs thanks to BTRFS.

Link to comment

Is there a way to tell if unRAID actually sees an SSD like it should?

 

Cache Device	Identification	Temp.	Size	Used	Free	Reads	Writes	Errors	FS	View
Cache	SSD2SC240G1SA754D117-820_PNY28140000551160327 (sdf) 234431032	212 °F	240 GB	5.41 GB	235 GB	24,526	33,930	0	btrfs	Browse /mnt/cache

Link to comment

Is there a way to tell if unRAID actually sees an SSD like it should?

 

Cache Device	Identification	Temp.	Size	Used	Free	Reads	Writes	Errors	FS	View
Cache	SSD2SC240G1SA754D117-820_PNY28140000551160327 (sdf) 234431032	212 °F	240 GB	5.41 GB	235 GB	24,526	33,930	0	btrfs	Browse /mnt/cache

 

What do you mean "like it should"?  I think there are some scenarios where SSDs aren't reporting data correctly (e.g. your extremely hot hard drive 8) )

 

Mine report fine:

 

Cache Device	Identification	Temp.	Size	Used	Free	Reads	Writes	Errors	FS	View
Cache	OCZ-AGILITY2_OCZ-30D5LE0QKL37GO26 (sdb) 112337032	86 °F	115 GB	112 GB	123 GB	8792	43,459	0	btrfs	Browse /mnt/cache
Cache2	OCZ-VERTEX3_OCZ-83X3BLPQ1M7YJ2YG (sde) 117220792	86 °F	120 GB	-	-	3360	43,876	0	btrfs	

Link to comment

With btrfs, if you assign an SSD device to the cache pool and start the array, it will perform a full device trim.  So yes, btrfs seems to work well at detecting SSDs and using trim.

 

I have 6.06 with a btrfs cache drive.  If I install 6.07, will trim run on that drive automagically when needed or scheduled, or do I have to manually invoke it?

Link to comment

1)  Intelligent Use of SSDs for read operations in a RAID1 pool

Scenario:  You have two drives in a RAID1 pool, one HDD and one SSD.  When write operations occur, the bottleneck for performance will be the HDD.  However, when read operations occur, btrfs will prefer the use of the SSD every time.  This has its benefits.

 

I didn't really get why you were focusing on RAID 1 - but reading between the lines I get the feeling you are expecting people to use an SSD as a cache/VM drive and looking to place that in a RAID 1 with a HDD as redundancy for the cache/VMs/containers?

 

Is the slow down on write of a HDD worth it? Mind, it would probably still be faster than the network.

 

In addition to the above, eventually there is potential for BTRFS to implement some really really cool stuff for file-based RAID instead of the current method.  This will be like Tom's user-shares but built into BTRFS and operating at a mixture of file/block level for potentially great performance.  What would this do?  Well in a pool, you could configure it to make sure that each FILE written to the pool is completely stored (in total) on however many devices you want.  So if for an individual file you wanted it to be triple redundant, you could do that.  If for another file, you only wanted it to be able to suffer two disk failures, you could do that too.  It would be intelligent enough to manipulate the RAID method on a file-by-file basis and again, potentially with great performance.

 

Which is nice, but one of the advantages of unRAID IMHO is you can set it so all the files for one entity are stored on the same drive - so if the worst happens you can get 'something' back. Block level seems to potentially bring us back to RAID 5-land of losing everything in the worst case.

 

Plus an advantage is being able to take the drive out of the array, plug it into a PC and recover something. Taking a single disk out of the middle of a BTRFS or XFS setup seems much more problematic.

 

I need to do more of this testing, but generally speaking, I think it's fairly clear that xfs vs. reiserfs is more about making a chess move now that will play out to our advantage in the longer term.  Quite simply:  we believe that if we fast-forward the clock, sooner or later there will be a point in time where xfs is going to have advantages for users over reiserfs.  Call it a hunch, an educated guess, a prediction...whatever.  We just really think reiserfs is on its way out the door.

 

My thoughts exactly, hence my questions/interest earlier about a route away from ReiserFS. However, as I also said those months ago, I think you need to seriously work on conversion of filesystems in place. Not all of us are going to get a new 4 TB disk just for (slowly) copying across and converting to a new FS. It creates friction, lots of friction.

 

Approaches such as FSTransform seem to cover the ground of non-destructive conversion in place (http://www.linux-magazine.com/Online/Features/Converting-Filesystems-with-Fstransform), with the proviso of necessary free space. As I said before, there was someone looking at bolting BTRFS capability into such a tool, and a cursory glance suggested a measure of compatibility in structure between ReiserFS and BTRFS that would seem to make it possible.
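For reference, fstransform's in-place invocation is a single command (shown here against a hypothetical device; the tool is experimental and needs enough free space on the device to relocate data, so a backup is still strongly advised):

```shell
# In-place conversion of an (unmounted) reiserfs device to xfs.
fstransform /dev/sdX xfs
```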

 

Link to comment

My thoughts exactly, hence my questions/interest earlier about a route away from ReiserFS. However, as I also said those months ago, I think you need to seriously work on conversion of filesystems in place. Not all of us are going to get a new 4 TB disk just for (slowly) copying across and converting to a new FS. It creates friction, lots of friction.

 

Approaches such as FSTransform seem to cover the ground of non-destructive conversion in place (http://www.linux-magazine.com/Online/Features/Converting-Filesystems-with-Fstransform ) with the proviso of necessary free space. As I said before, there was someone looking at bolting BTRFS capability into such a tool, and a cursory glance suggested a measure of compatibility in structure between ReiserFS and BTRFS that would seem to make it possible.

Supporting conversions is something we are open to investigating but not critical for short term implementation.

Link to comment

While it will take longer to format and move files, it's much safer if done carefully to a new disk.

 

Unless you have a backup of every single disk on your array or are OK with the potential to lose it,

 

I would suggest an md5sum of the whole disk, then rsyncing one disk to the other.

Run md5sum -c on the new filesystem and you are good to use that disk for the next hop.

 

 

 

Link to comment

Benefits of BTRFS Support for both Cache Pool and Array Devices

Some may be wondering what the benefits of BTRFS are in beta 7 compared to beta 6.  Here are just a few highlights:

 

1)  Intelligent Use of SSDs for read operations in a RAID1 pool

Scenario:  You have two drives in a RAID1 pool, one HDD and one SSD.  When write operations occur, the bottleneck for performance will be the HDD.  However, when read operations occur, btrfs will prefer the use of the SSD every time.  This has its benefits.

 

Let's say I currently have:

1 x SSD 500GB (Samsung Evo 800) as a Cache drive

1 x WD Green 3TB as a BTRFS drive for Docker use

 

do you recommend this process:

 

Add a 2nd SSD 500GB (Samsung Evo 800), then configure my 2 SSDs as a cache pool (in BTRFS), and then use the 3TB WD Green to start converting my array of 10 x WD Red 3TB currently formatted as ReiserFS over to XFS (not BTRFS, as I'll run Dockers & VMs on the new cache pool in BTRFS)?

 

 

Link to comment

Add a 2nd SSD 500GB (Samsung Evo 800), then configure my 2 SSDs as a cache pool (in BTRFS), and then use the 3TB WD Green to start converting my array of 10 x WD Red 3TB currently formatted as ReiserFS over to XFS (not BTRFS, as I'll run Dockers & VMs on the new cache pool in BTRFS)?

I'm in the same position, would appreciate any input :-)

Link to comment

I started with 1 extra 4TB drive and formatted it as XFS, then shifted all the contents from another drive over, then formatted the newly cleared drive and repeated.

 

I have one unRAID server about 80% converted, and the other about 25%.  I have been holding there to see how the betas go.  I am pretty sure I want to stick with XFS for all the array drives, but just in case something sways me over to BTRFS, I don't want to have to redo too much work...

Link to comment

just in case something sways me over to BTRFS...

 

Does the ability to validate your files with checksums interest you?

How about automatic compression/decompression?

 

Having the ability to verify your data in case of pending sectors or failed reads might be a useful feature for an archive server.
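For anyone curious, those features are a mount option and one command away on a btrfs disk (the device and disk number here are hypothetical examples):

```shell
# Transparent compression: mount with compress=lzo (or zlib);
# files written from then on are compressed automatically.
mount -o compress=lzo /dev/sdX1 /mnt/disk1

# Verify every block against its stored checksum, in the background.
btrfs scrub start /mnt/disk1

# Check scrub progress and any checksum errors found.
btrfs scrub status /mnt/disk1
```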

Link to comment

Those are the main reasons I am on the fence..

 

Things like:

 

Tower logger: WARNING! - Btrfs v3.14.2 IS EXPERIMENTAL

 

and all the articles that suggest it for dev/test systems, but not for production systems, worry me a bit...

 

I did actually convert a few array drives to btrfs to test and if I decide to go back, at least I have a couple in that format.

Link to comment

Ok...

 

I had a couple (3 or 4) of my WD Red 3TB drives that were in my array but totally empty.  I stopped the array, then changed the format to XFS for the last one (Disk 10), then started the array.  It shows up as XFS, but unformatted.  Is there a WebGUI command to have the disk formatted by unRAID?

Link to comment

Damn!

 

You're right!  I was looking for a Format button in the Disk detail page... Thanks

 

I now have 3 x 3TB XFS drives in the array.  What would be the easiest/fastest way to move the data?  Take the contents of the first 3TB drive and rsync it to a new drive via SSH?  Any recommendations for switches to use or other commands?

Link to comment

Ok...

 

I had a couple (3 or 4) of my WD Red 3TB drives that were in my array but totally empty.  I stopped the array, then changed the format to XFS for the last one (Disk 10), then started the array.  It shows up as XFS, but unformatted.  Is there a WebGUI command to have the disk formatted by unRAID?

When you say "in my array, but totally empty" do you mean they had already been precleared but had not yet been formatted to ReiserFS? Just trying to get a clearer idea of how this works. I normally don't think of a drive that hasn't been formatted as "in the array" yet.

 

On the other hand, if it did already have an empty ReiserFS on it, I would think unRAID would want to clear it first.

Link to comment

I converted a drive from rfs to xfs last night.  Stopped the array, changed the FS type to xfs, restarted the array, and it came up as unformatted.  Formatted the drive and away we went.

 

Things seem fine.

Now that I think about it more, that makes sense: unRAID must update parity when it formats a drive, so whatever changes are made to the disk when you change file systems, unRAID should keep parity in sync.

 

So, parity was in sync after it was originally formatted to ReiserFS, and parity was updated when it reformatted so it is still in sync. If you have an empty file system then changing to another empty file system should be OK.

 

Of course, this means that you could also change a drive that had files on it to another file system this way too, and you would wind up with an empty file system.

Link to comment

I now have 3 x 3TB XFS drives in the array.  What would be the easiest/fastest way to move the data?  Take the contents of the first 3TB drive and rsync it to a new drive via SSH?  Any recommendations for switches to use or other commands?

 

I have been using rsync with:

 

rsync -av --progress --remove-source-files /mnt/diskX/ /mnt/diskY/

 

This allows you to restart if it gets interrupted for any reason, and removes each source file once it has been successfully copied.

 

Just a warning: I wouldn't use the user shares when copying, since it was found that in certain cases copying from a disk to a share can truncate the data.  Also, it's best to run from screen so you won't have to worry about the session getting terminated...
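A minimal screen workflow for a long rsync looks like this (the session name "migrate" is arbitrary):

```shell
# Start a detachable session for the migration.
screen -S migrate

# ...run the rsync inside it, then detach with Ctrl-A d.
# Later, reattach from any SSH session to check progress:
screen -r migrate
```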

Link to comment

All this was sort of implied by Tom's description of changing file systems in the release announcement. Sometimes I have to understand how something works before I can understand what it does.
Link to comment


That has been my experience; as long as you switch the filesystem via unRAID and format, parity stays in sync.  I have migrated data off, reformatted to BTRFS, wrote some data, migrated that data to another drive, and then switched it to XFS, and didn't invalidate parity at all; from the user share, the data looked like nothing ever happened.

 

I was even watching a movie during my data migration; I just hit pause when I saw it getting close to the end of the rsync, and once it completed hit play, and it resumed as if nothing had happened...

Link to comment
