BTRFS Scrub Discussion



Can scrub be run while the array is online?

 

The thing that is lacking from btrfs right now is any kind of chkdsk-style repair tool if the data does go corrupt, but it does have the benefit of detecting bit rot.

 

'btrfs scrub' is sufficient for most use cases.  The reason there is no chkdsk-style tool is that there is quite a bit of debate as to whether one is needed.
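For anyone who hasn't tried it, the basic commands look like this (the /mnt/cache mount point is just an example, substitute your own):

# start a scrub in the background on a mounted btrfs filesystem
btrfs scrub start /mnt/cache

# check progress and results at any time
btrfs scrub status /mnt/cache

# or run it in the foreground and wait for it to finish
btrfs scrub start -B /mnt/cache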


Can scrub be run while the array is online?

 

It can.  Not sure of the implications for parity if errors are found and corrected during the scrub.  Probably best to run a parity check in that case.  The scrub will correct errors in both metadata and the actual data.  It's also run periodically during normal operation, or so I've read; not sure how that works.


Can scrub be run while the array is online?

 

It can.  Not sure of the implications for parity if errors are found and corrected during the scrub.  Probably best to run a parity check in that case.  The scrub will correct errors in both metadata and the actual data.  It's also run periodically during normal operation, or so I've read; not sure how that works.

 

From what I read, scrub can detect checksum errors on a BTRFS device.

Scrub can correct errors on devices that are redundant, i.e. snapshot or raid.

So for the cache it would help; for array devices I'm doubtful it can do anything other than detect the problem.
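As a rough illustration of the detect-only case on a single array disk (the mount point is just an example):

# per-device scrub results, including corrected vs. uncorrectable error counts
btrfs scrub status -d /mnt/disk1

# cumulative per-device error counters kept by btrfs
btrfs device stats /mnt/disk1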

 

Interesting read and example:

https://blogs.oracle.com/wim/entry/btrfs_scrub_go_fix_corruptions


Scrub can correct errors on devices that are redundant, i.e. snapshot or raid.

 

So it may not be all that useful for unRAID unless you are using snapshots, which I haven't tried yet.  BTRFS arrays within the context of unRAID may not be appropriate: safe, but perhaps overkill.

 

Just read more about scrubs and snapshots... snapshots don't do anything against bit rot.  'Snapshots work by use of btrfs's copy-on-write behaviour. A snapshot and the original it was taken from initially share all of the same data blocks. If that data is damaged in some way (cosmic rays, bad disk sector, accident with dd to the disk), then the snapshot and the original will both be damaged.'

 

Hmmmm.... I have now set up daily snapshots for each drive in my array.  It will be interesting to see if it causes any havoc with unRAID.  I am using the following script and have configured it to keep up to 5 snapshots.

 

http://pastebin.com/U1qgcPu6
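I won't reproduce the pastebin here, but a minimal sketch of the same idea (a dated read-only snapshot per disk, pruned to the newest 5) looks roughly like this; the disk paths, the .snapshots directory and the retention count are my assumptions, not the actual script contents:

#!/bin/bash
# sketch: take a dated read-only snapshot of each array disk, keep only the newest 5
KEEP=5
for disk in /mnt/disk1 /mnt/disk2 /mnt/disk3; do
    snapdir="$disk/.snapshots"
    mkdir -p "$snapdir"
    btrfs subvolume snapshot -r "$disk" "$snapdir/$(date +%Y%m%d)"
    # prune anything older than the newest $KEEP snapshots
    ls -1d "$snapdir"/* 2>/dev/null | head -n -"$KEEP" | while read -r old; do
        btrfs subvolume delete "$old"
    done
done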

 

What I need is a way to send an error email if a btrfs scrub fails.
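There's no built-in notification, but something along these lines should work; the mount point and address are placeholders, and it assumes a working 'mail' command on the box:

#!/bin/bash
# sketch: run a foreground scrub, then mail the status output if any errors were found
MOUNT=/mnt/disk1
EMAIL=admin@example.com   # placeholder address

btrfs scrub start -B "$MOUNT"

# the status line ends in "... with N errors"; anything other than 0 triggers a mail
if ! btrfs scrub status "$MOUNT" | grep -q "with 0 errors"; then
    btrfs scrub status "$MOUNT" | mail -s "btrfs scrub errors on $MOUNT" "$EMAIL"
fi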


Scrub can correct errors on devices that are redundant, i.e. snapshot or raid.

 

So it may not be all that useful for unRAID unless you are using snapshots, which I haven't tried yet.  BTRFS arrays within the context of unRAID may not be appropriate: safe, but perhaps overkill.

 

 

Hmmmm.... I have now set up daily snapshots for each drive in my array.  It will be interesting to see if it causes any havoc with unRAID.  I am using the following script and have configured it to keep up to 5 snapshots.

 

http://pastebin.com/U1qgcPu6

 

What I need is a way to send an error email if a btrfs scrub fails.

 

 

Is this for an unRAID array drive or for the cache array?


Has anyone run scrub yet?  How long did it take, and on how big a hard drive?  Some people on the internet have said 30 hours for 1 TB.

 

It's based on how fast you can read the hard drive. The scrub itself does not do all that much.

The fastest I've seen so far on my array is 190 MB/s for the 7200 RPM Seagate 3 TB, and 160-175 MB/s for the 5900 RPM Seagate 4 TB.

I suppose you can do an hdparm read test to find the best-case scenario.  Then there is jbartlett's diskspeed.sh.

A dd read from the drive can also give you a best-case benchmark.
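For example (the device name is an assumption, double-check yours before pointing anything at it):

# buffered read timing, best case on the outer tracks
hdparm -t /dev/sdb

# raw sequential read of roughly 4 GiB; dd prints the throughput at the end
dd if=/dev/sdb of=/dev/null bs=1M count=4096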

 

30 hrs/TB sounds about right for slower drives.


(1 TB) / (190 MB/s) = 1.4619883 hours

 

?!

 

We know that's not real world.  That's the best possible case: minimal head movement on the outer tracks, not even reading the filesystem, just a raw dd read.  It will be less than that, generally averaging 80-120 MB/s once you add in filesystem reads, seeks to files, etc.

 

I've run an md5deep pass on two 3 TB 5900 RPM drives; it's almost done at the 10-hour mark.

Then there is the other activity I've been doing on the drives, such as find runs down each drive to log a file list, and file tree walks into the sqlite database.  Looks like I'm averaging 80 MB/s for the md5deep runs.
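For reference, the hash runs are along these lines (the output path is an assumption):

# recursively hash every file on the disk and keep the list for later comparison
md5deep -r /mnt/disk1 > /boot/disk1.md5

# later: re-hash and list files whose current hash is not in the saved list (changed or new files)
md5deep -r -x /boot/disk1.md5 /mnt/disk1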


Will btrfs tell you if you have a corrupted file or bit rot automatically, or only after running scrub?  I am trying to decide between XFS and btrfs for all my array devices. I'm leaning towards XFS for now, since even Red Hat 7 uses it as the default; I have a feeling btrfs still has a year or two to go before it's considered stable enough.


Will btrfs tell you if you have a corrupted file or bit rot automatically, or only after running scrub?

 

A notice will be placed in the kernel log (dmesg) if the file is read and its checksum is incorrect.

I'm not sure whether it reaches syslog at this point.

From there you have to identify the file and/or initiate recovery from a snapshot or backup.
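If you want to check for it by hand, something like this should show any hits (the exact message wording varies by kernel version):

# checksum failures show up in the kernel log when an affected file is read
dmesg | grep -i 'csum failed'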

 

I am trying to decide between XFS and btrfs for all my array devices. I'm leaning towards XFS for now, since even Red Hat 7 uses it as the default; I have a feeling btrfs still has a year or two to go before it's considered stable enough.

 

I would suggest XFS at this point in time. Many of the hardware PVR vendors use it (from what I remember at least).

Using BTRFS for the cache would probably be OK.


I am trying to decide between XFS and btrfs for all my array devices. I'm leaning towards XFS for now, since even Red Hat 7 uses it as the default; I have a feeling btrfs still has a year or two to go before it's considered stable enough.

 

I think BTRFS is more than stable enough for use in unRAID.  It will only be using a subset of the feature set as I assume you will not be using redundant BTRFS arrays from within unRAID.  I chose BTRFS as Docker is in the future of unRAID and Docker requires BTRFS. I don't think you can go wrong with either XFS or BTRFS.

 

Has anyone run scrub yet?  How long did it take, and on how big a hard drive?  Some people on the internet have said 30 hours for 1 TB.

 

I've run scrubs on all of my drives.  The time it takes depends on the amount of data on the drive.  Here are a few samples...

 

root@server:~# btrfs scrub status /mnt/disk1
scrub status for a970b6b4-73f9-47c1-9d5e-bd24ea5d89ed
scrub started at Thu Sep 18 01:00:01 2014 and finished after 836 seconds
total bytes scrubbed: 84.73GiB with 0 errors

root@server:~# btrfs scrub status /mnt/disk2
scrub status for 6307f35a-ce94-49d0-b615-7f0c328ae7e7
scrub started at Thu Sep 18 01:15:01 2014 and finished after 12279 seconds
total bytes scrubbed: 947.09GiB with 0 errors

root@server:~# btrfs scrub status /mnt/disk3
scrub status for f76eee1c-fba1-4683-9257-288c1309a004
scrub started at Thu Sep 18 01:30:01 2014 and finished after 4599 seconds
total bytes scrubbed: 368.77GiB with 0 errors

root@server:~# btrfs scrub status /mnt/disk4
scrub status for 48a85482-b5e1-468f-9f19-6559336357e7
scrub started at Thu Sep 18 01:45:01 2014 and finished after 3777 seconds
total bytes scrubbed: 277.72GiB with 0 errors

root@server:~# btrfs scrub status /mnt/disk5
scrub status for 61fe317a-4d73-46d3-8351-341011f6c3ad
scrub started at Thu Sep 18 02:00:01 2014 and finished after 8253 seconds
total bytes scrubbed: 529.73GiB with 0 errors

 

And this thread is now completely off topic for this forum, so I'll drop it.  :)

