tucansam Posted July 22, 2017 Share Posted July 22, 2017 Is btrfs "stable" enough (everyone keeps using the word "experimental") to use on all array member disks? Does it have any protection against bit-rot? New server, wondering what filesystem to choose. Are the benefits vs xfs worth considering? Thanks. Quote Link to comment
SSD Posted July 22, 2017 Share Posted July 22, 2017 BTRFS is stable enough, but it is much more susceptible to corruption if you have dirty shutdowns. I would recommend XFS unless there is some specific feature of BTRFS you need. And if so, make sure you have a UPS that is appropriately configured. But do some study and make sure you understand what BTRFS is and does. Most people here, I'd say 95%+, are on XFS. I would not lose any sleep worrying about bitrot. I'd be much more concerned about disturbed cables after swapping out an ageing disk, a real problem that causes data loss that you can and should proactively avoid by using hot-swap cages. 1 1 Quote Link to comment
Frank1940 Posted July 22, 2017 Share Posted July 22, 2017 (edited) 38 minutes ago, bjp999 said: I would not lose any sleep worrying about bitrot. I'd be much more concerned about disturbed cables after swapping out an ageing disk, a real problem that causes data loss that you can and should proactively avoid by using hot-swap cages. And disasters (like fire, flooding, earthquakes and theft) are far more likely to result in total data loss than 'bitrot'! You should be providing for protection from them before worrying about the almost infinitesimal danger of losing data from bitrot. Edited July 22, 2017 by Frank1940 Quote Link to comment
JorgeB Posted July 22, 2017 Share Posted July 22, 2017 2 hours ago, tucansam said: Is btrfs "stable" enough (everyone keeps using the word "experimental") to use on all array member disks? I think so, I use it on all my servers, and have no problem recommending it as long as it's a stable and UPS-protected server. 2 hours ago, tucansam said: Does it have any protection against bit-rot? Yes, and more important than that for me, it allows you to be sure whether any files were corrupted when something unexpected happens, e.g., some read errors on another disk during a rebuild, a disk getting disabled during a file copy operation, etc. Quote Link to comment
tucansam Posted July 22, 2017 Author Share Posted July 22, 2017 Thanks Johnnie. How does that work? Is it automatic, or are there utilities to download and run? Quote Link to comment
JorgeB Posted July 22, 2017 Share Posted July 22, 2017 (edited) All files are checksummed automatically, if you want to check if everything is OK you can run a scrub, but btrfs checks all files on read and will error on any checksum error, i.e., you're are watching a movie from your server, if the checksum fails there will be and error during playback or copy. Edited July 22, 2017 by johnnie.black 1 Quote Link to comment
Zonediver Posted July 22, 2017 Share Posted July 22, 2017 So the best way would be ZFS - but does unraid support it? Quote Link to comment
JorgeB Posted July 22, 2017 Share Posted July 22, 2017 IMO the main current advantage of ZFS over btrfs is RAIDZ for pools; RAID5/6 on btrfs is still experimental and not ready for production. But using a ZFS pool would negate unRAID's main advantages over FreeNAS, like using the full capacity of different sized disks, the possibility of adding or removing disks from the array, etc. Since unRAID uses each disk as a separate filesystem, btrfs is a good option, and don't forget that unlike btrfs, ZFS has no filesystem repair tools: if a disk turns unmountable there's nothing you can do, and although rare, it happens; you can see that on the FreeNAS forum. 1 Quote Link to comment
Mlatx Posted October 23, 2017 Share Posted October 23, 2017 Hi all, I figured I'd piggyback on this thread. I just set up a new XFS Unraid server. So far it is working fine. My one concern is that XFS doesn't do any error checking. Should I be worried about that? Would it make sense to reformat the drives to BTRFS to have this capability? I don't have that much data on there that I couldn't start from a fresh disk again. I'm new to NAS servers and am curious to know what others think. Quote Link to comment
Frank1940 Posted October 24, 2017 Share Posted October 24, 2017 2 hours ago, Mlatx said: Would it make sense to reformat the drives to BTRFS to have this capability? I don't have that much data on there that I couldn't start from a fresh disk again. I'm new to NAS servers and am curious to know what others think. There already is a heck of a lot of error checking on any modern hard drive. If you want to have your head spinning just do a Google search on 'bit rot'. Personally, I feel that 'bit rot' is more of a boogeyman than a real life problem. With the amount of error detection and correction already incorporated into modern hard drives, a drive in constant use will probably fail for other reasons before bit rot causes a problem. Now if you are going to load up a hard drive and put it in a closet for twenty-five years, bit rot may be a problem when you go to read it. You now have one opinion... PS--- A couple of years ago, I found some data on raw read errors provided by one of the HD manufacturers and did some analysis on it. I was shocked to find out how often they can occur. As I recall, you get several of them EVERY time you do a parity check if you have an array with more than three drives in it. BUT, you never see them BECAUSE the drive's error detection finds them and the error correcting software then corrects them on the fly. Quote Link to comment
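For anyone who wants checksum detection while staying on XFS, the same idea can be approximated by hand with an ordinary checksum manifest (this is roughly what integrity plugins automate). A minimal sketch with standard tools; /tmp/demo and film.mkv are stand-ins for a real disk mount like /mnt/disk1 and your actual files:

```shell
# Build a checksum manifest while the data is known-good, then verify later.
# /tmp/demo is a throwaway stand-in for a real mount point (assumed path).
mkdir -p /tmp/demo
printf 'movie data' > /tmp/demo/film.mkv

# Record hashes now, while the files are trusted
( cd /tmp/demo && md5sum film.mkv > checksums.md5 )

# Re-run this any time to flag silently changed files
( cd /tmp/demo && md5sum -c checksums.md5 )   # prints: film.mkv: OK
```

Unlike btrfs, this only catches corruption when you run the check, and only for files hashed while they were still good.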
lionelhutz Posted October 24, 2017 Share Posted October 24, 2017 Just to make sure it's clear: a single BTRFS disk filesystem can detect data errors but it can't correct them. I've seen it assumed that BTRFS checksumming provides error correction, but on a single disk that only applies to the metadata. 2 Quote Link to comment
Mlatx Posted October 24, 2017 Share Posted October 24, 2017 (edited) I read that it can’t correct errors. However with that information in the metadata, can you correct the error manually? The challenge I suppose would be scrubbing the data. Is anyone here using Btrfs that can comment? Edited October 24, 2017 by Mlatx Quote Link to comment
JorgeB Posted October 24, 2017 Share Posted October 24, 2017 18 minutes ago, Mlatx said: I read that it can’t coerce the them. However with that information in the metadata, can you correct the error manually? The challenge I suppose would be scrubbing the data. Is anyone here using Btrfs that can comment? Not sure I understand the question, but I use btrfs on all my unRAID servers. For array data disks, scrub will detect any checksum mismatch; since I use the DUP profile for metadata, metadata errors will be detected and corrected, while data errors will only be detected and you'll then need to restore the affected file(s) from backup/original source. Quote Link to comment
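For reference, checking whether a disk's metadata already uses the DUP profile, and converting it if not, is done with standard btrfs-progs commands. A sketch with the /mnt/disk1 mount point assumed, wrapped in functions so nothing touches a disk until you actually call them:

```shell
# Show the data/metadata profiles in use on a mounted btrfs filesystem.
# A line like "Metadata, DUP" means two copies of metadata are kept on the disk.
show_profiles() {
  btrfs filesystem df "$1"
}

# Convert metadata chunks to the DUP profile in place, without reformatting.
convert_metadata_to_dup() {
  btrfs balance start -mconvert=dup "$1"
}

# e.g. show_profiles /mnt/disk1
```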
Mlatx Posted October 24, 2017 Share Posted October 24, 2017 Sorry phone botched the words. I corrected above. What do you use for the scrub? And how often do you run it? Quote Link to comment
JorgeB Posted October 24, 2017 Share Posted October 24, 2017 3 minutes ago, Mlatx said: What do you use for the scrub? You need to use the builtin btrfs scrub command. 3 minutes ago, Mlatx said: And how often do you run it? A parity check would catch bit-rot, and in that case I'd scrub all disks, but that hasn't happened yet. I only run it if something happens to a disk that makes me doubt its data integrity, e.g., when a disk redballs during a write operation; so far I've used it once after a redball and it found a corrupt file. Quote Link to comment
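The scrub mentioned above is just the btrfs-progs command run against the disk's mount point. A sketch, again wrapped in a function so it is inert until called (the /mnt/disk1 path is an assumption, substitute your own array disk):

```shell
# Scrub one array disk in the foreground, then show its error counters.
# Call as: scrub_disk /mnt/disk1
scrub_disk() {
  btrfs scrub start -B "$1"   # -B: stay in foreground, print stats when done
  btrfs scrub status "$1"     # summary: bytes scrubbed, checksum errors found
  btrfs device stats "$1"     # cumulative read/write/corruption counters
}
```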
Mlatx Posted October 24, 2017 Share Posted October 24, 2017 Thanks for your help. It seems like there isn’t much of a downside to using Btrfs. Quote Link to comment
JorgeB Posted October 24, 2017 8 minutes ago, Mlatx said: It seems like there isn’t much of a downside to using Btrfs. It's still not as stable as xfs, but it's stable enough for me, especially for single disk use, since on unRAID each data disk is a separate filesystem; a stable server (ideally with ECC) and a UPS are recommended. Quote Link to comment
lionelhutz Posted October 25, 2017 Share Posted October 25, 2017 I had really bad luck trying BTRFS on my cache drive with somewhat older hardware, say 6-8 year old parts. The data would start to corrupt after a few weeks. I would recommend you test the stability by setting up a single disk and doing a few days of fairly intense reading and writing to it. Quote Link to comment
JorgeB Posted October 25, 2017 7 minutes ago, lionelhutz said: The data would start to corrupt after a few weeks. Like I posted above, I won't argue that btrfs is the most stable of filesystems, and there are still issues, but nothing like that; that was most likely a hardware issue. Most users have no problems, and don't forget that there are many times more people also using it outside unRAID. Quote Link to comment
lionelhutz Posted October 25, 2017 1 minute ago, johnnie.black said: Like I posted above, I won't argue that btrfs is the most stable of filesystems, and there are still issues, but nothing like that; that was most likely a hardware issue. Most users have no problems, and don't forget that there are many times more people also using it outside unRAID. You'd think so, but the hardware was perfectly stable using XFS. Quote Link to comment
JorgeB Posted October 25, 2017 3 minutes ago, lionelhutz said: You'd think so, but the hardware was perfectly stable using XFS. If it's not that, there must be a reason, because if it was a common problem I would expect to see more people in the forums with the same issue. Quote Link to comment
lionelhutz Posted October 25, 2017 Share Posted October 25, 2017 (edited) 9 minutes ago, johnnie.black said: If it's not that there must be a reason, because if that was a common problem I would expect to see more people in the forums with the same problem. I'm sure there was a reason but it wasn't obvious and after seeing the issue I'm just saying to be careful about using BTRFS with older hardware. Test it and make sure it's stable before jumping in with both feet. I'm not the only one that has had BRTFS corruption occur for no apparent reason. I'm using it fine now on newer hardware since I've done some upgrades in the last year. Edited October 25, 2017 by lionelhutz 1 Quote Link to comment
vinski Posted November 3, 2017 Share Posted November 3, 2017 I am considering complementing my old Synology NAS with a dedicated archive server, and this topic deals with the core issue regarding using unRAID: providing storage suitable for write-once-read-rarely archives, with both disk-level checksum scrub and a failure parachute with parity. I would consequently appreciate the expertise of johnnie.black or other mature members regarding my newbie questions: > Is unRAID able to manage simultaneously on a single system two arrays, each consisting of n disks + parity? > Is the disk data displayed on the unRAID frontpage logged or loggable on a regular basis, including disk spun-down status? > Considering btrfs on each data disk, how do btrfs and unRAID manage a checksum error detected on either metadata or data: is the issue autohealed with the parity disk? Is the issue logged in detail? > Read/write errors on disks managing SMART, pending and reallocated sectors, with or without TLER settings: - whatever the data filesystem(s) is/are, how does unRAID manage the parity computing/checking when facing a read/write error on a data disk? On a parity disk? - considering btrfs on each data disk, how do unRAID and/or btrfs manage the failed disk and the disk array after a read/write error is detected on one of the data disks? I would greatly appreciate your detailed answers Quote Link to comment
JorgeB Posted November 3, 2017 45 minutes ago, vinski said: > Is unRAID able to manage simultaneously on a single system two arrays, each consisting of n disks + parity? Only one array per server. 47 minutes ago, vinski said: > Is the disk data displayed on the unRAID frontpage logged or loggable on a regular basis, including disk spun-down status? Spin-downs are logged; don't know what you mean by other data, but temps, space usage, etc. are not logged. 48 minutes ago, vinski said: > Considering btrfs on each data disk, how do btrfs and unRAID manage a checksum error detected on either metadata or data: is the issue autohealed with the parity disk? Is the issue logged in detail? Parity can't heal a checksum error, unless it was genuine bit rot, in which case you could rebuild the disk; metadata can be healed if you use the DUP profile. Any checksum error is logged, and you'll also get an I/O error if you try to read a corrupt file. 54 minutes ago, vinski said: > Read/write errors on disks managing SMART, pending and reallocated sectors, with or without TLER settings: - whatever the data filesystem(s) is/are, how does unRAID manage the parity computing/checking when facing a read/write error on a data disk? On a parity disk? - considering btrfs on each data disk, how do unRAID and/or btrfs manage the failed disk and the disk array after a read/write error is detected on one of the data disks? When unRAID encounters a read error it uses all other disks + parity to reconstruct the data and write that sector back to the disk. If the write is successful the read error is logged (and you'll get a warning if notifications are enabled); if it fails, that disk is disabled and its contents emulated. The filesystem used has no impact on how unRAID manages a disk read/write failure; the behavior is the same as I described above. Quote Link to comment
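The reconstruction step described above is, for single parity, plain XOR: the parity disk holds the XOR of the corresponding bits on every data disk, so any one missing disk can be recomputed from the survivors plus parity. A toy sketch using small numbers in place of disk sectors:

```shell
# Three toy "data disks", one sector each
d1=5; d2=9; d3=12

# Parity = XOR of every data disk's value at that position
parity=$(( d1 ^ d2 ^ d3 ))

# If disk 2 fails, its value is recomputed from the survivors + parity
rebuilt=$(( d1 ^ d3 ^ parity ))
echo "$rebuilt"   # prints 9, the lost d2 value
```

Dual parity (unRAID's second parity disk) uses a different second equation so that any two simultaneous failures can be solved, but the principle is the same.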
vinski Posted November 3, 2017 Share Posted November 3, 2017 Thanks so much for your detailed answer. To summarize the drive error management on unRAID with btrfs data drives: - electrical, mechanical, electronic data/parity drive failures are detected and logged by unRAID: drive to be replaced - limited bitrot errors transparently corrected by the drive firmware ECC - heavy bitrot errors, leading to read errors, corrected by unRAID with a parity rebuild, which then tries to write the data back onto the failed drive: just logged as an error if the write completes, or the disk is disabled and emulated by unRAID until replaced - btrfs metadata checksum errors corrected by the btrfs (default) metadata DUP attribute - btrfs data checksum errors (without data DUP): error indicated and logged during scrub, data copy is prevented, backup restore necessary One more question regarding unRAID+btrfs: > Is it possible to run a double parity array and to use the second parity drive to fix any failed drive (ideally remotely, until the failed drive is actually replaced): - either use the second parity drive as a replacement data drive - or keep the remaining parity drive in case of a failed parity drive Is this a manual process? Can it be automated on unRAID? Quote Link to comment