
File hashes, protection against bitrot and other data corruption failure


Helmonder


BTRFS has a checksum feature built into the filesystem.

 

PAR2 can be used to checksum, and possibly repair, a set of files.

I plan to create a program that sweeps the folders and creates a folder.par2 set of recovery files for the files in each folder.

 

It was advised many moons back to do this on a larger set of data.

In that experiment, I flattened the directory structure. (details withheld to prevent boredom). 

In that case, I had way too many files for it to be a feasible method of validation and protection.

Then there were the memory requirements for such a large set of data. The run often triggered an out-of-memory (OOM) condition, causing the par2 program to get murdered by the kernel.

 

This is why I plan to do it on a directory-by-directory basis, naming each set folder.par2 for consistency.
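
For anyone who wants to try the per-folder idea, here is a rough Python sketch of the sweep. It only builds the par2 command line for each folder; running it is a separate step. The flags (-n1 for one recovery file, -r10 for 10% redundancy) are the ones from the par2 example posted in this thread. The names par2_commands and run_all are made up for illustration, and skipping existing *.par2 files is my own assumption, so that a rerun does not try to protect old recovery sets.

```python
import os
import subprocess
from pathlib import Path


def par2_commands(root, redundancy=10, name="folder.par2"):
    """Yield (directory, argv) for every directory under root that holds files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumption: leave existing recovery files out of the new set.
        sources = sorted(f for f in filenames if not f.endswith(".par2"))
        if not sources:
            continue  # nothing to protect in this directory
        argv = ["par2", "create", "-n1", f"-r{redundancy}", name, *sources]
        yield Path(dirpath), argv


def run_all(root, **kwargs):
    """Run par2 inside each directory so the file names stay relative."""
    for directory, argv in par2_commands(root, **kwargs):
        subprocess.run(argv, cwd=directory, check=True)
```

Something like run_all("/mnt/disk3/Music") would then process one folder at a time, which should keep the memory needed per run tied to the size of a single folder rather than the whole disk.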

Link to comment

I know that BTRFS is designed to protect against bit rot.

 

From what I've read, with BTRFS the protection comes from a backup and/or snapshot.

Otherwise you are just warned there is a checksum error.

 

Therefore, unless you back up and/or snapshot the data on an array drive, I'm not sure it could protect you.

It looks as though you will be informed via dmesg.
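
If the only notification really is the kernel log, a periodic check could be sketched roughly as below. This is a guess at a workable filter, not a tested monitoring tool: it assumes the checksum messages contain both "BTRFS" and "csum failed", and the exact wording may differ between kernel versions. The function names are hypothetical.

```python
import subprocess


def btrfs_checksum_errors(log_text):
    """Return the log lines that look like BTRFS checksum failures.

    Assumption: such lines mention both "btrfs" and "csum failed".
    """
    return [
        line
        for line in log_text.splitlines()
        if "btrfs" in line.lower() and "csum failed" in line.lower()
    ]


def read_dmesg():
    """Read the kernel ring buffer; this may need root on some systems."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

Calling btrfs_checksum_errors(read_dmesg()) from a scheduled job and sending a mail when the list is not empty would at least turn the warning into something you notice.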

 

I'm not an expert on this, so please correct me if I'm wrong.

Link to comment

PAR2 Command line example per folder.

I'm providing this as food for thought and/or to educate those who might gain from the idea.

In this example, I create a folder.par2 file, delete a file and try to reconstruct it.

I'm not all that familiar with par2's usage, but it may provide food for discussion on creating a new verification/repair mechanism.

 

root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# par2 create -n1 -r10 folder.par2 *
par2cmdline version 0.4, Copyright (C) 2003 Peter Brian Clements.

par2cmdline comes with ABSOLUTELY NO WARRANTY.

This is free software, and you are welcome to redistribute it and/or modify
it under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version. See COPYING for details.

Block size: 102628
Source file count: 16
Source block count: 2000
Redundancy: 10%
Recovery block count: 200
Recovery file count: 1

Opening: 01 Enrico Donner - Abstract Dream.mp3
Opening: 02 Christian Hornbostel - Waiting At Potsdamer Chaussee (Re-Edit).mp3
Opening: 03 Rey Salinero - Enigma.mp3
Opening: 04 The Sura Quintet - Sunrise 4 You.mp3
Opening: 05 Kaxamalka - Suwawa.mp3
Opening: 06 Blue Wave - The Life Before.mp3
Opening: 07 Miraflores - Luz de Tavira.mp3
Opening: 08 Baghira - Underwater.mp3
Opening: 09 Cane Garden Quartet - Chillaxin'.mp3
Opening: 10 Lovers in Motion - Blue Morning Expressions.mp3
Opening: 11 Aquarius - Candles in Love.mp3
Opening: 12 DJ Riquo - My Lucky Day (Feat. Saba Rock).mp3
Opening: 13 Peter Linski Experience - Everything Flows.mp3
Opening: 14 Don Gorda Project - Feelin Free.mp3
Opening: 15 The Sura Quintet - Discovering Who You Are.mp3
Opening: folder.jpg
Computing Reed Solomon matrix.
Constructing: done.
Wrote 16776800 bytes to disk
Wrote 3748800 bytes to disk
Writing recovery packets
Writing verification packets
Done

root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# ls -l
total 220323
-rw-rw-rw- 1 rcotrone 522 13303153 2014-09-16 12:07 01\ Enrico\ Donner\ -\ Abstract\ Dream.mp3
-rw-rw-rw- 1 rcotrone 522 13814172 2014-09-16 12:07 02\ Christian\ Hornbostel\ -\ Waiting\ At\ Potsdamer\ Chaussee\ (Re-Edit).mp3
-rw-rw-rw- 1 rcotrone 522 13162074 2014-09-16 12:07 03\ Rey\ Salinero\ -\ Enigma.mp3
-rw-rw-rw- 1 rcotrone 522 13749337 2014-09-16 12:07 04\ The\ Sura\ Quintet\ -\ Sunrise\ 4\ You.mp3
-rw-rw-rw- 1 rcotrone 522 14102476 2014-09-16 12:07 05\ Kaxamalka\ -\ Suwawa.mp3
-rw-rw-rw- 1 rcotrone 522 14522543 2014-09-16 12:07 06\ Blue\ Wave\ -\ The\ Life\ Before.mp3
-rw-rw-rw- 1 rcotrone 522 15142166 2014-09-16 12:07 07\ Miraflores\ -\ Luz\ de\ Tavira.mp3
-rw-rw-rw- 1 rcotrone 522 13042954 2014-09-16 12:07 08\ Baghira\ -\ Underwater.mp3
-rw-rw-rw- 1 rcotrone 522 12562325 2014-09-16 12:07 09\ Cane\ Garden\ Quartet\ -\ Chillaxin'.mp3
-rw-rw-rw- 1 rcotrone 522 13522609 2014-09-16 12:07 10\ Lovers\ in\ Motion\ -\ Blue\ Morning\ Expressions.mp3
-rw-rw-rw- 1 rcotrone 522 14098314 2014-09-16 12:07 11\ Aquarius\ -\ Candles\ in\ Love.mp3
-rw-rw-rw- 1 rcotrone 522 13590523 2014-09-16 12:07 12\ DJ\ Riquo\ -\ My\ Lucky\ Day\ (Feat.\ Saba\ Rock).mp3
-rw-rw-rw- 1 rcotrone 522 12876860 2014-09-16 12:07 13\ Peter\ Linski\ Experience\ -\ Everything\ Flows.mp3
-rw-rw-rw- 1 rcotrone 522 13825605 2014-09-16 12:07 14\ Don\ Gorda\ Project\ -\ Feelin\ Free.mp3
-rw-rw-rw- 1 rcotrone 522 13042999 2014-09-16 12:07 15\ The\ Sura\ Quintet\ -\ Discovering\ Who\ You\ Are.mp3
-rw-rw-rw- 1 rcotrone 522    77499 2014-09-16 12:07 folder.jpg
-rw-rw-rw- 1 root     522    44272 2014-09-16 13:18 folder.par2
-rw-rw-rw- 1 root     522 20892676 2014-09-16 13:18 folder.vol000+200.par2

root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# ls -l --si
total 226M
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 01\ Enrico\ Donner\ -\ Abstract\ Dream.mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 02\ Christian\ Hornbostel\ -\ Waiting\ At\ Potsdamer\ Chaussee\ (Re-Edit).mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 03\ Rey\ Salinero\ -\ Enigma.mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 04\ The\ Sura\ Quintet\ -\ Sunrise\ 4\ You.mp3
-rw-rw-rw- 1 rcotrone 522 15M 2014-09-16 12:07 05\ Kaxamalka\ -\ Suwawa.mp3
-rw-rw-rw- 1 rcotrone 522 15M 2014-09-16 12:07 06\ Blue\ Wave\ -\ The\ Life\ Before.mp3
-rw-rw-rw- 1 rcotrone 522 16M 2014-09-16 12:07 07\ Miraflores\ -\ Luz\ de\ Tavira.mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 08\ Baghira\ -\ Underwater.mp3
-rw-rw-rw- 1 rcotrone 522 13M 2014-09-16 12:07 09\ Cane\ Garden\ Quartet\ -\ Chillaxin'.mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 10\ Lovers\ in\ Motion\ -\ Blue\ Morning\ Expressions.mp3
-rw-rw-rw- 1 rcotrone 522 15M 2014-09-16 12:07 11\ Aquarius\ -\ Candles\ in\ Love.mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 12\ DJ\ Riquo\ -\ My\ Lucky\ Day\ (Feat.\ Saba\ Rock).mp3
-rw-rw-rw- 1 rcotrone 522 13M 2014-09-16 12:07 13\ Peter\ Linski\ Experience\ -\ Everything\ Flows.mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 14\ Don\ Gorda\ Project\ -\ Feelin\ Free.mp3
-rw-rw-rw- 1 rcotrone 522 14M 2014-09-16 12:07 15\ The\ Sura\ Quintet\ -\ Discovering\ Who\ You\ Are.mp3
-rw-rw-rw- 1 rcotrone 522 78k 2014-09-16 12:07 folder.jpg
-rw-rw-rw- 1 root     522 45k 2014-09-16 13:18 folder.par2
-rw-rw-rw- 1 root     522 21M 2014-09-16 13:18 folder.vol000+200.par2

root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# par2 verify folder.par2
par2cmdline version 0.4, Copyright (C) 2003 Peter Brian Clements.

par2cmdline comes with ABSOLUTELY NO WARRANTY.

This is free software, and you are welcome to redistribute it and/or modify
it under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version. See COPYING for details.

Loading "folder.par2".
Loaded 34 new packets
Loading "folder.vol000+200.par2".
Loaded 200 new packets including 200 recovery blocks

There are 16 recoverable files and 0 other files.
The block size used was 102628 bytes.
There are a total of 2000 data blocks.
The total size of the data files is 204435609 bytes.

Verifying source files:

Target: "01 Enrico Donner - Abstract Dream.mp3" - found.
Target: "02 Christian Hornbostel - Waiting At Potsdamer Chaussee (Re-Edit).mp3" - found.
Target: "03 Rey Salinero - Enigma.mp3" - found.
Target: "04 The Sura Quintet - Sunrise 4 You.mp3" - found.
Target: "05 Kaxamalka - Suwawa.mp3" - found.
Target: "06 Blue Wave - The Life Before.mp3" - found.
Target: "07 Miraflores - Luz de Tavira.mp3" - found.
Target: "08 Baghira - Underwater.mp3" - found.
Target: "09 Cane Garden Quartet - Chillaxin'.mp3" - found.
Target: "10 Lovers in Motion - Blue Morning Expressions.mp3" - found.
Target: "11 Aquarius - Candles in Love.mp3" - found.
Target: "12 DJ Riquo - My Lucky Day (Feat. Saba Rock).mp3" - found.
Target: "13 Peter Linski Experience - Everything Flows.mp3" - found.
Target: "14 Don Gorda Project - Feelin Free.mp3" - found.
Target: "15 The Sura Quintet - Discovering Who You Are.mp3" - found.
Target: "folder.jpg" - found.

All files are correct, repair is not required.

root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# rm folder.jpg 
root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# par2 verify folder.par2
...
Target: "14 Don Gorda Project - Feelin Free.mp3" - found.
Target: "15 The Sura Quintet - Discovering Who You Are.mp3" - found.
Target: "folder.jpg" - missing.

Scanning extra files:


Repair is required.
1 file(s) are missing.
15 file(s) are ok.
You have 1999 out of 2000 data blocks available.
You have 200 recovery blocks available.
Repair is possible.
You have an excess of 199 recovery blocks.
1 recovery blocks will be used to repair.

root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# par2 repair folder.par2
...
Verifying source files:

Target: "01 Enrico Donner - Abstract Dream.mp3" - found.
Target: "02 Christian Hornbostel - Waiting At Potsdamer Chaussee (Re-Edit).mp3" - found.
Target: "03 Rey Salinero - Enigma.mp3" - found.
Target: "04 The Sura Quintet - Sunrise 4 You.mp3" - found.
Target: "05 Kaxamalka - Suwawa.mp3" - found.
Target: "06 Blue Wave - The Life Before.mp3" - found.
Target: "07 Miraflores - Luz de Tavira.mp3" - found.
Target: "08 Baghira - Underwater.mp3" - found.
Target: "09 Cane Garden Quartet - Chillaxin'.mp3" - found.
Target: "10 Lovers in Motion - Blue Morning Expressions.mp3" - found.
Target: "11 Aquarius - Candles in Love.mp3" - found.
Target: "12 DJ Riquo - My Lucky Day (Feat. Saba Rock).mp3" - found.
Target: "13 Peter Linski Experience - Everything Flows.mp3" - found.
Target: "14 Don Gorda Project - Feelin Free.mp3" - found.
Target: "15 The Sura Quintet - Discovering Who You Are.mp3" - found.
Target: "folder.jpg" - missing.

Scanning extra files:

Repair is required.
1 file(s) are missing.
15 file(s) are ok.
You have 1999 out of 2000 data blocks available.
You have 200 recovery blocks available.
Repair is possible.
You have an excess of 199 recovery blocks.
1 recovery blocks will be used to repair.

Computing Reed Solomon matrix.
Constructing: done.
Solving: done.

Wrote 77499 bytes to disk
Verifying repaired files:

Target: "folder.jpg" - found.

Repair complete.

root@unRAID:/mnt/disk3/Music/music.mp3/Chill/Various Artists/Cocktail Lounge Session, Vol. 2# ls -l folder.jpg
-rw-rw-rw- 1 root 522 77499 2014-09-16 13:20 folder.jpg
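
The numbers par2 reports above can be checked by hand. The short Python sketch below redoes the arithmetic using the block size from the create output and the file sizes from the ls -l listing. It assumes each file occupies a whole number of blocks (the last one padded), which is how PAR2 slices its input as far as I understand it. The variable and function names are only for illustration.

```python
BLOCK_SIZE = 102628      # "Block size" reported by par2 create
RECOVERY_BLOCKS = 200    # "Recovery block count" (10% of 2000)

# File sizes in bytes from the ls -l listing: tracks 01-15, then folder.jpg
SIZES = [
    13303153, 13814172, 13162074, 13749337, 14102476, 14522543,
    15142166, 13042954, 12562325, 13522609, 14098314, 13590523,
    12876860, 13825605, 13042999, 77499,
]


def blocks_needed(size, block_size=BLOCK_SIZE):
    """Blocks used by one file: ceiling division, since files do not share blocks."""
    return -(-size // block_size)


per_file = [blocks_needed(size) for size in SIZES]

print(sum(SIZES))                    # 204435609, the total size par2 verify reports
print(sum(per_file))                 # 2000 source blocks
print(per_file[-1])                  # 1 block for folder.jpg
print(max(per_file))                 # 148 blocks for the largest track
print(RECOVERY_BLOCKS * BLOCK_SIZE)  # 20525600 = 16776800 + 3748800 bytes written
```

This is why deleting folder.jpg costs only 1 of the 200 recovery blocks. With 10% redundancy the set could rebuild any single file in the folder (the largest needs 148 blocks), but not two whole tracks at once, since even the two smallest need 123 + 126 = 249 blocks.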

Link to comment

With unRAID moving away from ReiserFS, and given the advice not to use BTRFS on production systems, I think we need a solution for XFS. Or even better, a solution that is not dependent on the filesystem.

 

Although I am not afraid of tinkering, I would personally need to find something that just works. It would also need to be able to either:

 

- stay up to date automatically, OR

- run periodically, identifying which files have changed since last time and notifying me about them, because a change could be legitimate OR bitrot.
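
The second option could look roughly like the Python sketch below, which does not depend on the filesystem. It records size, modification time and a SHA-256 hash per file, then compares two such snapshots. The heuristic is my own assumption: if the content hash changed but size and mtime did not, nothing admits to having written the file, so it is flagged as suspect. All names here are hypothetical.

```python
import hashlib
import os


def file_hash(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map relative path -> (size, mtime_ns, sha256) for every file under root."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            key = os.path.relpath(path, root)
            result[key] = (info.st_size, info.st_mtime_ns, file_hash(path))
    return result


def compare(old, new):
    """Sort the differences between two snapshots into four buckets."""
    report = {"added": [], "missing": [], "modified": [], "suspect": []}
    for path, entry in new.items():
        if path not in old:
            report["added"].append(path)
        elif entry[2] != old[path][2]:
            # Same size and mtime but different content: possible bitrot.
            same_metadata = entry[:2] == old[path][:2]
            report["suspect" if same_metadata else "modified"].append(path)
    report["missing"] = [path for path in old if path not in new]
    return report
```

The old snapshot would have to be stored between runs (for example as JSON), and a "suspect" entry is only a hint, since a program can also rewrite a file and restore its timestamp.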

Link to comment

Archived

This topic is now archived and is closed to further replies.
