Data Integrity Enabled by Default on BTRFS Array File Systems?



I bought an UnRAID license a few weeks ago based on being able to use btrfs on drives in the array. I wanted the data integrity that btrfs can provide, in other words checksumming of both the metadata and the data. A YouTube video I was watching last night stated that in UnRAID, only the metadata has checksumming enabled, not the data.

 

I need to get an authoritative answer on whether that is true or not, since my purchase decision to go with UnRAID was based on it using btrfs with full data integrity (both metadata and file data) enabled.

 

Link to comment

By default, system shares are set to NOCOW, which also disables data checksums. Any user share defaults to COW with data checksums. You can also change that for the system share (though it needs to be recreated). This is set on the share settings page; currently "Auto" means yes, so data checksums are enabled.
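For anyone who wants to check this from the terminal rather than the share settings page: on btrfs, NOCOW shows up as the `C` attribute in `lsattr` output. A minimal Python sketch, assuming `lsattr` is available on the server. The helper names and the `/mnt/user/system` path in the usage note are illustrative, not taken from UnRAID.

```python
import subprocess


def has_nocow_flag(lsattr_line: str) -> bool:
    """Return True if one line of `lsattr` output carries the 'C' (No_COW) attribute.

    Only the first whitespace-separated field (the attribute flags) is inspected,
    so a capital C in the path itself does not count.
    """
    flags = lsattr_line.split(maxsplit=1)[0]
    return "C" in flags


def directory_is_nocow(path: str) -> bool:
    """Run `lsattr -d` on a directory and report whether it is marked No_COW."""
    result = subprocess.run(
        ["lsattr", "-d", path], capture_output=True, text=True, check=True
    )
    return has_nocow_flag(result.stdout.strip())
```

Calling `directory_is_nocow("/mnt/user/system")` (hypothetical path) would return True for a NOCOW share, which per the post above also means no data checksums for files created in it.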

 


Link to comment

Thanks for the quick and concise reply. That answers my question.

 

What will be the system's (UnRAID combined with btrfs) response to a data block checksum error on a user share while serving the file over the network? Will the corrupt data be served, the transfer gracefully aborted, or some sort of system panic (ungraceful exit)? Is the file containing the bad block deleted, marked bad, or simply left alone?

 

 

Link to comment
4 minutes ago, Toadster said:

What will be the system's (UnRAID combined with btrfs) response to a data block checksum error on a user share while serving the file over the network?

You'll get an I/O error and the copy/read will fail. This is standard for any filesystem that does checksums (ZFS is the same). The checksum error will also be logged in the syslog.

Link to comment

Thanks. That closes my question.

 

For giggles, I'll create a 2-disk btrfs array today, put a character string in a text file on the btrfs file system, move that data drive to a Windows machine and change one character in the string with WinHex, and put the drive back in the UnRaid array, then read the text file over the network.

Link to comment

>...if you change the file outside the filesystem and write it back new checksums will be created for that block(s).

 

You may be thinking of drive-level checksums used transparently on native drive sectors by the drive firmware to detect read errors. You are correct that regardless of operating system or file system, drive-level sector checksums will be recalculated by the drive firmware during any sector write operation. But drive-level sector checksums are transparent to btrfs. I don't want to create a drive-level read error. I don't know if that's possible without hacking the drive's firmware. I want to create a btrfs file checksum mismatch at the OS level. This is my goal: The drive will read the sector storing the target character string and pass it up to the OS, the OS will recalculate the btrfs checksum of the file or block, and compare it to the checksum stored in the btrfs metadata. What happens after that is the purpose of my test.

 

From my limited reading of the btrfs documentation, btrfs data checksums are calculated by the computer OS and stored in data structures in the btrfs metadata area. It's not that simple, but basically that.

 

The point in moving the drive to a foreign OS to booger the data is to prevent btrfs from recalculating its data checksums and updating its metadata.

 

WinHex disk editor can search raw sectors for character or hex strings and present the sector data in hexadecimal for editing and then write the sector back to the drive. Perfect for my needs.

 

Reference:

https://btrfs.wiki.kernel.org/index.php/Btrfs_design#Btree_Data_structures

 

See section heading "Files".

 

My only goal here is to verify that btrfs data integrity is enabled and that the system will trap checksum errors when they are detected. The details of how it's done don't matter to me other than understanding just enough to set up a test and run it so I can move on.

Link to comment
13 minutes ago, Toadster said:

WinHex disk editor can search raw sectors for character or hex strings and present the sector data in hexadecimal for editing and then write the sector back to the drive. Perfect for my needs.

I misunderstood what you said. I thought you wanted to copy the file to Windows, change it there, then copy it back. Operating directly on the sectors works. Some time ago I tested in a similar way, using dd on Linux to write zeros on top of the data; that will also work.

Link to comment

>Maybe obvious, modifying an array disk outside the array invalidates parity. 

Yes, but UnRAID shouldn't detect that unless I check or rebuild parity, so it won't affect my test. (Unless parity is checked during network share reads, and I don't suspect it is, is it?)

 

I pulled my data off the UnRAID NAS temporarily to a backup server yesterday with TerraCopy verify-after-write, and reset the UnRaid configuration today to set up for the test: One data disk and one parity disk to keep it simple. Both were "precleared" overnight with no SMART changes reported in the logs except temp change and power-on hours. They're both newer WD Blues using CMR.

Link to comment

The test is finished.

 

Short Story:

1.) UnRAID failed to abort the file transfer from the network share to a Windows 10 client upon detection of the btrfs data corruption.

2.) The btrfs data corruption was detected and logged in the system log.

3.) A manual scrub of the array data drive containing the corruption did detect the corruption.

4.) The transferred file contained a 24K section of bytes filled with zeros (value 0x00) that encompassed the corruption byte.

 

The Test Setup:

I created a roughly 200MB file containing pseudorandom byte data. The size was chosen to be big enough to preclude btrfs from stuffing the data away in metadata somewhere. Random data was used so any compression algorithm would store the data uncompressed. If the data were compressed, I would be unable to locate the test character string on the drive with a hex editor.

 

On a Windows machine, in roughly the middle of the random data file, I placed the string 'BrightLightsBigCity'. This would be the string that would have the corruption modification made after the data file was in place on the UnRaid array drive.
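For anyone wanting to repeat this, here is a rough Python sketch of building such a test file. I did not use this script for the test above; the function name, the use of `os.urandom`, and the exact midpoint placement are assumptions for illustration only.

```python
import os

MARKER = b"BrightLightsBigCity"


def write_test_file(path, size=200 * 1024 * 1024, marker=MARKER, chunk=1024 * 1024):
    """Write `size` pseudorandom bytes to `path`, then overwrite the middle with `marker`.

    Returns the byte offset at which the marker starts.
    """
    offset = size // 2
    if offset + len(marker) > size:
        raise ValueError("file too small to hold the marker at its midpoint")
    written = 0
    with open(path, "wb") as f:
        # Random data is incompressible, so the marker stays searchable on disk.
        while written < size:
            n = min(chunk, size - written)
            f.write(os.urandom(n))
            written += n
        # Go back and place the marker without changing the file length.
        f.seek(offset)
        f.write(marker)
    return offset
```

The returned offset is the position within the file, not on the disk, so a raw-sector search is still needed to find the string on the drive.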

 

The 200MB test file was moved from the Windows client to a public share on UnRAID via the network. The parity drive had regenerated during the afternoon and was valid.

 

UnRaid was gracefully shut down and the array data drive moved to a Windows machine.

 

WinHex was used to open the raw drive for searching and editing. The test phrase 'BrightLightsBigCity' was found with a text search, and one character of the phrase changed to introduce corruption. 'BrightLightsBigCity' was changed to 'BrightFightsBigCity'. The sector containing the edit was written back to the disk and the system shut down.
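The same find-and-patch step can be sketched in Python for a disk image or, with care, a raw device. This is an illustration of the idea and not what WinHex does internally. The function name is made up, and running it against a raw device path such as `/dev/sdX` is an assumption I have not tested; the filesystem must not be mounted while doing so.

```python
def corrupt_marker(path, old=b"BrightLightsBigCity", new=b"BrightFightsBigCity",
                   chunk=1024 * 1024):
    """Overwrite the first occurrence of `old` in `path` with `new`, in place.

    Returns the byte offset that was patched, or -1 if `old` was not found.
    The write bypasses any filesystem stored in the image, so that filesystem's
    own data checksums are not updated.
    """
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    overlap = len(old) - 1
    with open(path, "r+b") as f:
        pos = 0      # offset in the file of the first byte held in `buf`
        buf = b""
        while True:
            data = f.read(chunk)
            if not data:
                return -1
            buf += data
            idx = buf.find(old)
            if idx != -1:
                f.seek(pos + idx)
                f.write(new)
                return pos + idx
            # Keep a short tail so a match straddling two reads is still found.
            drop = max(0, len(buf) - overlap)
            pos += drop
            buf = buf[drop:]
```

Only one byte actually changes here ('L' to 'F'), which is enough to make the stored btrfs checksum for that block stale.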

 

The array drive was put back into UnRaid, the machine started, and the corrupted file read from the public network share. No transfer errors occurred, the transfer finished normally, and the file length of the retrieved test file was correct.

 

The btrfs data error was logged in the system log.

 

A manual btrfs scrub of the array data drive was ordered and also caught the corruption.

 

A byte-by-byte comparison was made between the original test file (as it was before being moved to UnRaid) and the file read back from the UnRaid share. A 24K-long section of zeroed data encompassed and overwrote the corrupted phrase. It looks to me like btrfs or the OS zeroed out the bad btrfs block containing the corrupted phrase and simply passed the block along to the Windows client retrieving the test file from the public UnRAID share.
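A comparison like this can be sketched in a few lines of Python. This is not the tool used for the test, and the function names are made up. It reports the span from the first to the last differing byte, then checks whether the retrieved copy is all zeros across that span.

```python
def differing_span(path_a, path_b, chunk=1 << 20):
    """Return (start, end) covering the first through last differing byte, or None.

    `end` is exclusive. Both files must have the same length.
    """
    first = last = None
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            if len(a) != len(b):
                raise ValueError("files differ in length")
            if not a:
                break
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        if first is None:
                            first = offset + i
                        last = offset + i
            offset += len(a)
    return None if first is None else (first, last + 1)


def is_zero_filled(path, start, end):
    """Return True if every byte of `path` in [start, end) is 0x00."""
    with open(path, "rb") as f:
        f.seek(start)
        block = f.read(end - start)
    return len(block) == end - start and block.count(0) == len(block)
```

Since the original is random data, a few of its bytes will be 0x00 by chance, so the measured span can come out slightly shorter than the true zeroed region at either edge.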

 

Conclusion:

a.) A btrfs share created with default settings will detect data file corruption.

b.) The btrfs block of the file containing that corruption will be zeroed-out and served to at least Windows 10 network clients without aborting the transfer.

c.) Btrfs data corruption errors are written to the UnRaid system log.

d.) A manual scrub of an array data drive will catch and log any btrfs data corruption on that drive.

 

Issues:

The anticipated behavior of UnRaid was that the transfer of a data file with btrfs data corruption would be aborted mid-transfer.

 

 

Jan 21 19:37:09 UNRAID kernel: BTRFS warning (device md1): csum failed root 5 ino 320 off 100376576 csum 0xae5d491d expected csum 0x5271ff7e mirror 1
Jan 21 19:37:09 UNRAID kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Jan 21 19:37:09 UNRAID kernel: BTRFS warning (device md1): csum failed root 5 ino 320 off 100376576 csum 0xae5d491d expected csum 0x5271ff7e mirror 1
Jan 21 19:37:09 UNRAID kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jan 21 19:38:49 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:39:12 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:39:56 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:40:04 UNRAID ntpd[1364]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Jan 21 19:42:23 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:45:00 UNRAID root: Fix Common Problems Version 2020.12.27
Jan 21 19:45:01 UNRAID root: Fix Common Problems: Warning: Plugin Update Check not enabled
Jan 21 19:45:04 UNRAID root: Fix Common Problems: Other Warning: Background notifications not enabled
Jan 21 19:46:51 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:46:58 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:47:37 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:52:00 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 19:52:02 UNRAID kernel: BTRFS warning (device md1): csum failed root 5 ino 320 off 100376576 csum 0xae5d491d expected csum 0x5271ff7e mirror 1
Jan 21 19:52:02 UNRAID kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
Jan 21 19:52:09 UNRAID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 21 20:02:47 UNRAID ool www[12056]: /usr/local/emhttp/plugins/dynamix/scripts/btrfs_scrub 'start' '/mnt/disk1' '-r'
Jan 21 20:02:47 UNRAID kernel: BTRFS info (device md1): scrub: started on devid 1
Jan 21 20:02:53 UNRAID kernel: BTRFS warning (device md1): checksum error at logical 2109837312 on dev /dev/md1, physical 3191967744, root 5, inode 320, offset 100376576, length 4096, links 1 (path: test/BrightLightsBigCityBig)
Jan 21 20:02:53 UNRAID kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Jan 21 20:02:53 UNRAID kernel: BTRFS info (device md1): scrub: finished on devid 1 with status: 0
Jan 21 20:04:09 UNRAID kernel: BTRFS warning (device md1): csum failed root 5 ino 320 off 100376576 csum 0xae5d491d expected csum 0x5271ff7e mirror 1
Jan 21 20:04:09 UNRAID kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
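To pull the useful fields out of the `csum failed` lines above, something like the following Python sketch could be used. The regular expression is written against the exact message format shown in this log; other kernel versions may word the message differently, so treat the pattern as an assumption.

```python
import re

# Matches the "csum failed" warnings exactly as they appear in the log above.
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<device>[^)]+)\): csum failed "
    r"root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<csum>0x[0-9a-f]+) expected csum (?P<expected>0x[0-9a-f]+) "
    r"mirror (?P<mirror>\d+)"
)


def parse_csum_failures(lines):
    """Return one dict per 'csum failed' line, with numeric fields converted to int."""
    failures = []
    for line in lines:
        m = CSUM_RE.search(line)
        if not m:
            continue
        failures.append({
            "device": m.group("device"),
            "root": int(m.group("root")),
            "ino": int(m.group("ino")),
            "off": int(m.group("off")),
            "csum": int(m.group("csum"), 16),
            "expected": int(m.group("expected"), 16),
            "mirror": int(m.group("mirror")),
        })
    return failures
```

For the lines above this gives inode 320 and file offset 100376576, which is an exact multiple of 4096 and matches the `length 4096` block reported by the scrub.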

 

 


 

 


Link to comment
6 hours ago, Toadster said:

The anticipated behavior of UnRaid was that the transfer of a data file with btrfs data corruption would be aborted mid-transfer.

There was an issue with some releases that had Samba aio enabled, which would prevent the I/O error on a checksum error, but it should be disabled in the current release. What release were you using? Any custom settings?

Link to comment

>...what release were you using?

6.9.0-RC2

 

>Any custom settings?

None that I'm aware of. My use case is a plain vanilla NAS. Stability and data integrity are the only requirements. No Dockers, VMs, SSD cache drives, speed tweaks or 'stupid user tricks' allowed. The only sharing is through SMB. There is an APC UPS connected via USB.

 

Let me be very clear after running the test: UnRAID fully meets my needs. All I require is the ability to run a monthly manual scrub on the btrfs file systems to check for metadata and data corruption. Whether UnRAID aborts a share transfer of a btrfs file with data corruption is irrelevant to me.

 

The UnRAID NAS was holding working copies of important personal files, not a hoard of bootleg movies. If I lose the files, that's fine, I have orderly offline and cloud backups, but corruption must not sneak in.
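A monthly check along those lines could be scripted roughly as below. This is a sketch, not an UnRAID feature. It assumes the `btrfs` command-line tool is on the PATH and that `btrfs device stats` prints lines of the form `[/dev/md1].corruption_errs 5`. The function names are made up, and `/mnt/disk1` is taken from the scrub log earlier in the thread.

```python
import subprocess


def parse_device_stats(text):
    """Sum the per-device counters printed by `btrfs device stats` into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        # "[/dev/md1].corruption_errs" -> "corruption_errs"
        counter = parts[0].rsplit(".", 1)[-1]
        stats[counter] = stats.get(counter, 0) + int(parts[1])
    return stats


def scrub_and_report(mount_point="/mnt/disk1"):
    """Run a read-only scrub in the foreground, then return the error counters."""
    # -B waits for the scrub to finish, -r makes it read-only.
    # The exit status is deliberately not checked here.
    subprocess.run(["btrfs", "scrub", "start", "-B", "-r", mount_point], check=False)
    out = subprocess.run(
        ["btrfs", "device", "stats", mount_point],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_device_stats(out)
```

The counters are cumulative, so a nonzero `corruption_errs` only says corruption has been seen at some point since they were last reset, not that the latest scrub found something new.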

 

 

 

Link to comment

The command completed without error.

 


 

 

Also, the files are the same size but not identical. (Probably has a segment filled with zeros again.)

 


 


 

I'll re-flash the UnRAID USB boot drive, bring up the system from scratch, and make sure this isn't something unique to my system.

 

Link to comment

Just had a thought: you used user shares, and I've always tested on disk shares. Please try:

cp /mnt/disk1/test/file /mnt/disk1/test/newfile

I won't be near a server for the rest of the day, so I can only test tomorrow. Dollars to doughnuts that is the issue. IIRC FUSE uses some direct flag/option on mount, and I know that can cause the exact issue you're seeing.

Link to comment

The UnRAID system is gone. I re-flashed the USB drive with the latest stable release to bring up a clean system for testing.

 

>...you used users shares, I've always tested on disk shares...

Disk share vs user share, I'll have to look that up, but I'll try it. Only been using UnRAID the last few weeks.

 

Link to comment
1 hour ago, Toadster said:

Disk share vs user share, I'll have to look that up,

In a nutshell, disk shares are the contents of each individual disk. They don't show up on the network by default, but you can turn that option on if you want. User shares are the combined view of all identically named root folders on each of the individual disks. The reason disk shares aren't shown by default is that it's easy for an inexperienced user to corrupt files if they combine file operations between the two different schemes. Until you figure out exactly how disk and user shares interact, don't mix them in file operations. Copy or move only disk to disk, or user to user, never disk share to user share. After you have a handle on exactly what's going on behind the scenes, it's easy to avoid the pitfalls and you can navigate as you please between the two.

 

Novice users often see their files in two differently named locations and think they accidentally created duplicates somehow, or try to take shortcuts with file management that end up corrupting their files.

 

Quick example. /mnt/disk1/shareA/Subfolder1/text.txt is /mnt/user/shareA/Subfolder1/text.txt, so deleting the first file makes both go away. If you instead manually create /mnt/cache/shareA/Subfolder1/text.txt so the same folder and file name are on both drives, then the user share named shareA will only show one of the two files, and if you delete the file visible in the user share, it will delete one of the files, and nothing will appear to happen because the user share will start showing the other file. Once you understand the nuances, you can do some interesting tricks with it.
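The path relationship in that example can be written down as a small Python sketch. It only manipulates path strings under the `/mnt/<disk>/<share>/...` layout described above and does not touch the filesystem. The function names are made up for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def user_share_path(disk_path):
    """Map /mnt/<disk or cache>/<share>/... to the /mnt/user/<share>/... view."""
    parts = PurePosixPath(disk_path).parts  # ('/', 'mnt', 'disk1', 'shareA', ...)
    if len(parts) < 4 or parts[:2] != ("/", "mnt") or parts[2] == "user":
        raise ValueError(f"not a disk share path: {disk_path}")
    return str(PurePosixPath("/mnt/user", *parts[3:]))


def shadowed_files(disk_paths):
    """Group disk paths by user share path, keeping only names that collide."""
    groups = defaultdict(list)
    for p in disk_paths:
        groups[user_share_path(p)].append(p)
    return {user: paths for user, paths in groups.items() if len(paths) > 1}
```

Feeding it both `/mnt/disk1/shareA/Subfolder1/text.txt` and `/mnt/cache/shareA/Subfolder1/text.txt` yields a single user share path with two backing files, which is the case where deleting through the user share appears to do nothing.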

Link to comment

I flashed the UnRAID USB drive with v6.8.3, zeroed the hard drives, and brought up a system with a single public share and re-ran the test.

 

A copy command at the terminal of the on-disk test file aborts with an error. A copy command of the test file through FUSE completes normally, without error.

 

I then upgraded UnRAID to v6.9.0-RC2 and got the same result.

 

The issue has been around a while, apparently.
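For anyone who wants to reproduce the disk path versus FUSE path comparison without eyeballing `cp` output, here is a rough Python sketch. The function name is made up. The two paths in the usage note are inferred from the file name in the earlier scrub log and the standard `/mnt/disk1` and `/mnt/user` mount points, so adjust them to your own test file.

```python
import errno
import hashlib


def probe_read(path, chunk=1 << 20):
    """Read a file to the end and report what happened.

    Returns ("ok", sha256_hex) if the whole file was readable, or
    ("error", errno_name) if the OS refused, for example "EIO".
    """
    digest = hashlib.sha256()
    try:
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                digest.update(block)
    except OSError as exc:
        return ("error", errno.errorcode.get(exc.errno, str(exc.errno)))
    return ("ok", digest.hexdigest())
```

Running it on `/mnt/disk1/test/BrightLightsBigCityBig` and then on `/mnt/user/test/BrightLightsBigCityBig` should, going by the results above, give an error for the disk path and an "ok" with a hash that differs from the original file for the user share path.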

 

Link to comment
