btrfs or xfs for new server?



BTRFS is stable enough, but it is much more susceptible to corruption if you have dirty shutdowns.

 

I would recommend XFS unless there is some specific feature of BTRFS you need. And if so, make sure you have a UPS that is appropriately configured.

 

But do some study first and make sure you understand what BTRFS is and does. Most people here, I'd say 95%+, are on XFS.

 

I would not lose any sleep worrying about bitrot. I'd be much more concerned about disturbed cables after swapping out an ageing disk, a real problem that causes data loss and that you can and should proactively avoid by using hot-swap cages.

38 minutes ago, bjp999 said:

I would not lose any sleep worrying about bitrot. I'd be much more concerned about disturbed cables after swapping out an ageing disk, a real problem that causes data loss and that you can and should proactively avoid by using hot-swap cages.

 

And disasters (like fire, flooding, earthquakes and theft) are far more likely to result in total data loss than 'bitrot'! You should be protecting against them before worrying about the almost infinitesimal danger of losing data to bitrot.

2 hours ago, tucansam said:

Is btrfs "stable" enough (everyone keeps using the word "experimental") to use on all array member disks?

 

I think so; I use it on all my servers and have no problem recommending it, as long as it's a stable and UPS-protected server.

 

2 hours ago, tucansam said:

Does it have any protection against bit-rot? 

 

Yes, and more important than that for me, it lets you be sure whether any files were corrupted when something unexpected happens, e.g., some read errors on another disk during a rebuild, a disk getting disabled during a file copy operation, etc.


All files are checksummed automatically. If you want to check that everything is OK you can run a scrub, but btrfs also verifies checksums on every read and will error on any mismatch, i.e., if you're watching a movie from your server and the checksum fails, there will be an error during playback or copy.
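For anyone unfamiliar, kicking off and checking a manual scrub looks something like this (the /mnt/disk1 path is just an illustration, following unRAID's /mnt/diskN mount convention):

    # start a scrub of the filesystem mounted at /mnt/disk1
    btrfs scrub start /mnt/disk1

    # check progress and see how many checksum errors were found
    btrfs scrub status /mnt/disk1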


IMO the main current advantage of ZFS over btrfs is RAIDZ for pools; RAID5/6 on btrfs is still experimental and not ready for production. But using a ZFS pool would negate unRAID's main advantages over FreeNAS, like using the full capacity of different-sized disks and the possibility of adding or removing disks from the array. Since unRAID uses each disk as a separate filesystem, btrfs is as good an option. And don't forget that unlike btrfs, ZFS has no filesystem repair tools: if a disk turns unmountable there's nothing you can do, and although rare, it happens; you can see that on the FreeNAS forum.
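For reference, the btrfs repair tooling mentioned above is the btrfs check command, run against an unmounted filesystem (the device name below is a placeholder):

    # read-only consistency check, safe to run
    btrfs check /dev/sdX1

    # last-resort repair mode; use only when advised, ideally after imaging the disk
    btrfs check --repair /dev/sdX1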

  • 3 months later...

Hi all,

 

I figured I'd piggyback on this thread. I just set up a new XFS unRAID server. So far it is working fine. My one concern is that XFS doesn't do any data checksumming. Should I be worried about that? Would it make sense to reformat the drives to BTRFS to have this capability? I don't have so much data on there that I couldn't start from a fresh disk again. I'm new to NAS servers and am curious to know what others think.

 

 

2 hours ago, Mlatx said:

Would it make sense to reformat the drives to BTRFS to have this capability? I don't have so much data on there that I couldn't start from a fresh disk again. I'm new to NAS servers and am curious to know what others think.

 

There already is a heck of a lot of error checking on any modern hard drive. If you want to have your head spinning, just do a Google search on 'bit rot'. Personally, I feel that 'bit rot' is more of a bogeyman than a real-life problem. With the amount of error detection and correction already incorporated into modern hard drives, a drive in constant use will probably fail for other reasons before bit rot causes a problem. Now if you are going to load up a hard drive and put it in a closet for twenty-five years, bit rot may be a problem when you go to read it.

 

You now have one opinion...

 

PS: A couple of years ago, I found some data on raw read errors provided by one of the HD manufacturers and did some analysis on it. I was shocked to find out how often they can occur. As I recall, you get several of them EVERY time you do a parity check if you have an array with more than three drives in it. BUT you never see them BECAUSE the drive's error detection finds them and the error-correcting software then corrects them on the fly.
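If you're curious, you can peek at a drive's own error counters with smartctl from smartmontools (the device path is a placeholder, and which attributes get reported varies by vendor):

    # dump SMART attributes; look for Raw_Read_Error_Rate and, on some
    # drives, Hardware_ECC_Recovered to see the ECC quietly doing its job
    smartctl -A /dev/sdX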

18 minutes ago, Mlatx said:

I read that it can't correct them. However, with that information in the metadata, can you correct the error manually? The challenge I suppose would be scrubbing the data. Is anyone here using btrfs who can comment?

 

Not sure I understand the question, but I use btrfs on all my unRAID servers. For array data disks, a scrub will detect any checksum mismatch. Since I use the DUP profile for metadata, metadata errors are detected and corrected; for data, errors are only detected, and you'll then need to restore the affected file(s) from a backup or the original source.
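For anyone who wants to check or switch profiles, something like this (the mount point is just an example):

    # show the data/metadata profiles currently in use
    btrfs filesystem df /mnt/disk1

    # convert existing metadata to the DUP profile on a single-device filesystem
    btrfs balance start -mconvert=dup /mnt/disk1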

3 minutes ago, Mlatx said:

What do you use for the scrub?

 

You need to use the built-in btrfs scrub command.

 

3 minutes ago, Mlatx said:

And how often do you run it?

 

Parity check would catch bit-rot, and in that case I'd scrub all disks, but that hasn't happened yet.

 

I only run it if something happens to a disk that makes me doubt its data integrity, e.g., when a disk redballs during a write operation. So far I've used it once after a redball, and it found a corrupt file.
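If you'd rather scrub on a schedule instead, a minimal sketch (assuming all array disks are btrfs and mounted at unRAID's usual /mnt/diskN paths; the timing is just an example, and on unRAID you'd typically wire this up via the User Scripts plugin rather than a raw crontab):

    # /etc/cron.d entry: foreground-scrub every array disk at 03:00 on the 1st of each month
    0 3 1 * * root for d in /mnt/disk[0-9]*; do btrfs scrub start -B "$d"; done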

7 minutes ago, lionelhutz said:

The data would start to corrupt after a few weeks.

 

Like I posted above, I won't argue that btrfs is the most stable of filesystems, and there are still issues, but nothing like that; it was most likely a hardware issue. Most users have no problems, and don't forget that there are many times more people using it outside unRAID.

 

1 minute ago, johnnie.black said:

 

Like I posted above, I won't argue that btrfs is the most stable of filesystems, and there are still issues, but nothing like that; it was most likely a hardware issue. Most users have no problems, and don't forget that there are many times more people using it outside unRAID.

 

 

You'd think so, but the hardware was perfectly stable using XFS.

 

9 minutes ago, johnnie.black said:

 

If it's not that, there must be some other reason, because if it were a common problem I would expect to see more people in the forums with the same issue.

 

I'm sure there was a reason, but it wasn't obvious, and after seeing the issue I'm just saying to be careful about using BTRFS with older hardware. Test it and make sure it's stable before jumping in with both feet. I'm not the only one who has had BTRFS corruption occur for no apparent reason.

 

I'm using it fine now on newer hardware since I've done some upgrades in the last year.

 

 

  • 2 weeks later...

 

 

I am considering complementing my old Synology NAS with a dedicated archive server, and this topic deals with the core issue in my use of unRAID: providing storage suitable for write-once-read-rarely archives, with both disk-level checksum scrubbing and a failure parachute via parity.

I would consequently appreciate the expertise of johnnie.black or other experienced members regarding my newbie questions:

> Is unRAID able to manage two arrays simultaneously on a single system, each consisting of n disks + parity?

> Is the disk data displayed on the unRAID front page logged, or loggable on a regular basis, including disk spin-down status?

> Considering btrfs on each data disk, how do btrfs and unRAID manage a checksum error detected on either metadata or data: is the issue auto-healed using the parity disk? Is the issue logged in detail?

> Regarding read/write errors on disks reporting SMART, pending, and reallocated sectors, with or without TLER settings:

    - whatever the data filesystem(s) is/are, how does unRAID manage parity computing/checking when facing a read/write error on a data disk? On a parity disk?

    - considering btrfs on each data disk, how do unRAID and/or btrfs manage the failed disk and the array after a read/write error is detected on one of the data disks?

I would greatly appreciate your detailed answers.
 

 

45 minutes ago, vinski said:

> Is unRAID able to manage two arrays simultaneously on a single system, each consisting of n disks + parity?

Only one array per server.

 

47 minutes ago, vinski said:

> Is the disk data displayed on the unRAID front page logged, or loggable on a regular basis, including disk spin-down status?

Spin-downs are logged; I don't know what you mean by other data, but temps, space usage, etc. are not logged.

 

48 minutes ago, vinski said:

> Considering btrfs on each data disk, how do btrfs and unRAID manage a checksum error detected on either metadata or data: is the issue auto-healed using the parity disk? Is the issue logged in detail?

 

Parity can't heal a checksum error unless it was genuine bit rot; in that case you could rebuild the disk. Metadata can be healed if you use the DUP profile.

Any checksum error is logged, and you'll also get an I/O error if you try to read a corrupt file.
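To illustrate (the file path here is hypothetical): reading a file whose checksum no longer matches fails instead of silently returning bad data, and the kernel log records the mismatch:

    # the read is refused with an I/O error rather than returning corrupt bytes
    cp /mnt/disk1/movies/example.mkv /tmp/
    # cp: error reading '/mnt/disk1/movies/example.mkv': Input/output error

    # the kernel log shows the corresponding btrfs checksum complaints
    dmesg | grep -i 'csum failed'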

 

54 minutes ago, vinski said:

> Regarding read/write errors on disks reporting SMART, pending, and reallocated sectors, with or without TLER settings:

    - whatever the data filesystem(s) is/are, how does unRAID manage parity computing/checking when facing a read/write error on a data disk? On a parity disk?

    - considering btrfs on each data disk, how do unRAID and/or btrfs manage the failed disk and the array after a read/write error is detected on one of the data disks?

 

When unRAID encounters a read error it uses all the other disks + parity to reconstruct the data and write that sector back to the disk. If the write is successful, the read error is logged (and you'll get a warning if notifications are enabled); if it fails, that disk is disabled and its contents emulated.

 

The filesystem used has no impact on how unRAID manages a disk/write failure; the behavior is the same as I described above.

 

 

 


Thanks so much for your detailed answer.

To summarize the drive error management on unRAID with btrfs data drives:

- electrical, mechanical, or electronic data/parity drive failures are detected and logged by unRAID: the drive is to be replaced

- limited bit-rot errors are transparently corrected by the drive firmware's ECC

- heavy bit-rot errors, leading to a read error, are corrected by unRAID from parity, which then tries to write the data back onto the failing drive: this is just logged as an error if the write completes, otherwise the disk is disabled and emulated by unRAID until replaced

- btrfs metadata checksum errors are corrected thanks to the (default) metadata DUP profile

- btrfs data checksum errors (without data DUP): the error is indicated and logged during scrub, reading the corrupt data is prevented, and restoring from backup is necessary

 

One more question regarding unRAID + btrfs:

> Is it possible to run a double-parity array and to use the second parity drive to fix a failed drive (ideally remotely, until the failed drive is actually replaced):

   - either by using the second parity drive as a replacement data drive,

   - while keeping the remaining parity drive in case of a failed parity drive?

Is this a manual process? Can it be automated on unRAID?

